Now that we understand what Ethereum is and the history behind it, let’s take a look at how Ethereum works. If you haven’t read our previous article, click here to learn more about the background behind the broader Ethereum network.
Like the Bitcoin network, Ethereum is a distributed state machine: every node must agree on the state of the network, including everything built, transacted, or created on it. Ethereum tracks this state using a Modified Merkle Patricia Trie, sometimes called an “MPT” for short.
The Ethereum specification uses this structure, a combination of a Patricia trie and a Merkle tree, to record changes in the state. A Merkle tree is a data structure in which each non-leaf node (Hash 0 and Hash 1) is the hash of its child nodes (Hash 0-0, Hash 0-1, and so on). The leaf nodes, L1, L2, L3, and L4, are the lowest tier of nodes and store the actual data. As you can see from the exhibit below, the top hash is computed from the combination of its child hashes.
Merkle trees allow for a succinct representation of large amounts of data -- perfect for blockchains! The process takes an arbitrary number of hashes and condenses them into a single hash, so that the top hash serves as a fingerprint of the state of the entire data set.
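To make the fingerprint idea concrete, here is a minimal sketch of computing a Merkle root in Python. It makes a few assumptions that are not in the article: it uses SHA-256 from the standard library as a stand-in (Ethereum itself uses Keccak-256), it hashes each data block before pairing, and it duplicates the last hash when a level has an odd number of entries. The function name `merkle_root` is purely illustrative.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Stand-in hash function; Ethereum itself uses Keccak-256."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash the leaves, then hash pairs upward until one hash remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed convention: repeat the last hash on odd-sized levels.
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"L1", b"L2", b"L3", b"L4"])
tampered_root = merkle_root([b"L1", b"L2", b"L3", b"L4 (edited)"])
print(root.hex())
print(root != tampered_root)  # changing any leaf changes the top hash
```

Because every parent depends on its children, editing a single leaf changes the top hash, which is what lets one short value stand in for the whole data set.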
A Patricia trie, on the other hand, is a data structure in which keys that share a prefix also share the same path from the root. As you can see below, everything underneath the ‘t’ branch leads to ‘to’ or ‘te.’ The ‘te’ node can be further broken down into ‘tea,’ ‘ted,’ or ‘ten.’ Furthermore, the combined keys along a specific path create the address. For example, the key for the non-leaf node ‘inn’ would be ‘1159.’
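The prefix-sharing idea can be sketched with a small trie in Python. This is a plain, uncompressed trie rather than a full Patricia trie (which also merges chains of single-child nodes), and the class and method names are illustrative assumptions.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps one character to the next node
        self.is_word = False  # True if a stored key ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            # Reuse the existing branch when the prefix is already stored.
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def keys_with_prefix(self, prefix: str) -> list[str]:
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        found, stack = [], [(node, prefix)]
        while stack:
            current, path = stack.pop()
            if current.is_word:
                found.append(path)
            for ch, child in current.children.items():
                stack.append((child, path + ch))
        return sorted(found)


trie = Trie()
for word in ["to", "tea", "ted", "ten", "inn"]:
    trie.insert(word)

print(trie.keys_with_prefix("te"))  # ['tea', 'ted', 'ten']
```

Here ‘tea,’ ‘ted,’ and ‘ten’ all pass through the same ‘t’ and ‘e’ nodes, so the shared prefix is stored only once.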
A Merkle Patricia trie combines both ideas: keys as paths, and hashes that summarize everything beneath them. Let’s use the simplified example below. Working from top to bottom, add a 0 for every path that goes left and a 1 for every path that goes right. This means the dark blue dot corresponds to the key “010,” and the light blue dot to “11.” As in a Merkle tree, the very top hash is a representation of all the hashes below it. For the dark blue dot to be validated, the hashes of all its parent nodes must be recomputed, one level at a time, until the top hash is updated.
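The left/right key scheme and the walk back up to the top hash can be sketched as follows. This is a simplified illustration, not Ethereum's actual trie: it assumes a complete binary tree whose number of leaves is a power of two, fixed-length bit-string keys, and SHA-256 in place of Keccak-256. The helper names `build_levels`, `prove`, and `verify` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of hashes, from the leaf hashes up to the root."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def prove(levels: list[list[bytes]], key: str) -> list[bytes]:
    """Collect the sibling hash at each level on the way up from the leaf."""
    index = int(key, 2)  # "010" = left, right, left = leaf number 2
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # the neighbour under the same parent
        index //= 2
    return proof


def verify(root: bytes, key: str, leaf: bytes, proof: list[bytes]) -> bool:
    """Recombine hashes from the leaf upward and compare with the top hash."""
    node = h(leaf)
    for bit, sibling in zip(reversed(key), proof):
        # A 1 bit means this node is the right child, so the sibling goes first.
        node = h(sibling + node) if bit == "1" else h(node + sibling)
    return node == root


leaves = [f"value-{i}".encode() for i in range(8)]
levels = build_levels(leaves)
root = levels[-1][0]

proof = prove(levels, "010")
print(verify(root, "010", leaves[2], proof))    # True
print(verify(root, "010", b"forged", proof))    # False
```

Only three sibling hashes are needed to check one leaf out of eight, which is why the top hash is enough to validate any single entry.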
In Ethereum’s actual trie, each line segment corresponds to one of 16 possible values that can be added to the key, one for each hexadecimal digit. To avoid thousands of unused nodes, MPTs introduce special node types: if several keys have a run of segments in common, that shared run is represented by a single special node.
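As a rough illustration of the hexadecimal paths, the sketch below splits a key into 4-bit digits (often called nibbles) and finds the run that two keys have in common. The function names and the sample keys are made up for this example; in the real trie, a shared run like this is the kind of thing that gets collapsed into one of the special nodes.

```python
def to_nibbles(key: bytes) -> list[int]:
    """Split each byte into two hexadecimal digits (values 0 to 15)."""
    nibbles = []
    for byte in key:
        nibbles.append(byte >> 4)    # high four bits
        nibbles.append(byte & 0x0F)  # low four bits
    return nibbles


def shared_prefix(a: list[int], b: list[int]) -> list[int]:
    """Return the leading run of digits that both paths have in common."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


# Hypothetical keys, chosen only to show a shared prefix.
path_a = to_nibbles(bytes.fromhex("a7113551"))
path_b = to_nibbles(bytes.fromhex("a7f93651"))

print(path_a)                         # [10, 7, 1, 1, 3, 5, 5, 1]
print(shared_prefix(path_a, path_b))  # [10, 7]
```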
In sum, Ethereum’s management structure allows participants to easily verify that they hold the same state as the state root itself. When any transaction is made, the state root is updated, and miners or validators must verify that their hash is the same as the top hash. There are two types of accounts to be aware of: contract accounts and externally owned accounts.
An Ethereum contract account is a program that lives on the blockchain and is controlled by code. Each contract account has four properties that make up its state on the Ethereum network: a nonce, a balance, its code, and its storage.
As a refresher from our previous article, a smart contract is a program that allows users to transact with each other according to a certain set of predetermined rules -- thus removing the need for a third party to enforce these rules.
Because of this, the ownership of a smart contract is defined entirely in the contract’s code. This means there must be a function that allows the creator of the smart contract to perform admin operations (e.g., destroy the contract, take all the ether out, prevent other people from removing funds, etc.).
Without this security within a smart contract, bad actors could easily hijack it and take complete control.
Externally owned accounts are the ones tied to your public and private keys. These accounts carry two main properties: the nonce and the balance. Because these accounts are controlled by private keys and have no code associated with them, their code and storage are empty.
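One way to picture the two account types is as the same record with different fields filled in. The sketch below is a simplified model, not how a client actually stores accounts: the `Account` class, its field types, and the `is_contract` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    nonce: int = 0      # count of transactions sent from this account
    balance: int = 0    # held in wei, the smallest unit of ether
    code: bytes = b""   # empty for externally owned accounts
    storage: dict = field(default_factory=dict)  # empty for externally owned accounts

    @property
    def is_contract(self) -> bool:
        # In this model, an account with code is a contract account.
        return len(self.code) > 0


# Hypothetical example values.
externally_owned = Account(nonce=3, balance=2 * 10**18)
contract = Account(balance=10**18, code=b"\x60\x00", storage={0: 42})

print(externally_owned.is_contract)  # False
print(contract.is_contract)          # True
```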
The cryptography on Ethereum works similarly to that of Bitcoin: keys are created using the Elliptic Curve Digital Signature Algorithm (ECDSA), in which a private key can derive a public key, but not the reverse. Think of your public key as a bank account number and your private key as the password that lets you log in to your account, ultimately proving you are the owner.
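The one-way relationship between the two keys can be sketched in pure Python. Ethereum, like Bitcoin, uses the secp256k1 curve, and the public key is the private key (a large integer) multiplied by the curve's fixed generator point. The code below is a teaching sketch only: it is not constant-time, it performs no signing, and it should not be used to handle real keys. The function names are illustrative, and the modular inverse via `pow(x, -1, P)` assumes Python 3.8 or newer.

```python
# secp256k1 domain parameters (curve equation: y^2 = x^3 + 7 over the field of size P).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    x1, y1 = a
    x2, y2 = b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        slope = (3 * x1 * x1) * pow(2 * y1, -1, P) % P  # tangent line
    else:
        slope = (y2 - y1) * pow(x2 - x1, -1, P) % P     # line through both points
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def public_key(private_key: int):
    """Compute private_key * G with the double-and-add method."""
    if not 1 <= private_key < N:
        raise ValueError("private key out of range")
    result, addend, k = None, G, private_key
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


x, y = public_key(123456789)  # hypothetical private key, far too small for real use
print(hex(x))
print((y * y - x**3 - 7) % P == 0)  # the result lies on the curve
```

Going from the private key to the public key takes a few hundred point additions; going the other way means solving the elliptic curve discrete logarithm problem, for which no efficient method is known.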
Externally owned accounts, meanwhile, can transfer ether to or trigger contract accounts, or transact directly with other externally owned accounts. Similarly, contract accounts can call functions on other contracts; these calls are also known as trace functions. A ‘trace’ is a step in the execution of a transaction.
Theoretically, through the use of trace functions, a contract could call on a number of other contracts - requiring heavy computing resources. On a shared and decentralized platform, how is it possible that users can call other contracts while still maintaining a balance in resources? What stops me from writing an infinite loop to crash the entire system?
The solution to this problem: Gas.
Every opcode (a machine-readable EVM instruction) in a smart contract has a gas cost. Gas, priced in gwei, is used to allocate the resources of the Ethereum Virtual Machine (EVM) so that decentralized applications, such as contract accounts, can execute autonomously. Having a separate unit for gas allows for a distinction between the actual value of the transaction and the computational cost of running a function.
A gas limit refers to the maximum amount of gas you are willing to spend on a particular transaction. While there are cheap operations that require very little gas (such as PUSH, SWAP, and DUP), there are also expensive operations that use heavy computing power and take up a larger amount of storage space. Generally speaking, a transaction that needs a higher gas limit requires more work to execute, which makes the function more expensive.
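To see how a gas limit stops a runaway program, consider the toy interpreter below. It is a sketch under loud assumptions: the instruction set is reduced to a handful of names, the gas costs in `GAS_COST` are illustrative figures rather than an authoritative fee schedule, and the `run` function and `OutOfGas` exception are hypothetical. The point is only that every step is charged, so an infinite loop runs out of gas instead of running forever.

```python
class OutOfGas(Exception):
    """Raised when the next instruction costs more gas than is left."""


# Illustrative per-instruction costs; cheap stack operations vs. a costly storage write.
GAS_COST = {"PUSH": 3, "DUP": 3, "SWAP": 3, "JUMP": 8, "SSTORE": 20_000}


def run(program: list[tuple], gas_limit: int) -> int:
    """Execute (opcode, argument) pairs and return the total gas used."""
    gas_left = gas_limit
    pc = 0  # program counter
    while pc < len(program):
        op, arg = program[pc]
        cost = GAS_COST[op]
        if cost > gas_left:
            raise OutOfGas(f"{op} needs {cost} gas, only {gas_left} left")
        gas_left -= cost
        # JUMP moves to the given position; everything else falls through.
        pc = arg if op == "JUMP" else pc + 1
    return gas_limit - gas_left


cheap = [("PUSH", 1), ("DUP", None), ("SWAP", None)]
infinite_loop = [("PUSH", 1), ("JUMP", 0)]  # jumps back to the start forever

print(run(cheap, gas_limit=100))  # 9

try:
    run(infinite_loop, gas_limit=100)
except OutOfGas as error:
    print("halted:", error)
```

With a limit of 100, the loop completes nine passes at 11 gas each and is then stopped, having consumed resources the sender already agreed to pay for.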
Let’s use an example. In the simplest terms, if Alice wants to transfer $100 to Bill, she must pay a $5 transaction and processing fee. While the $100 is the actual value of the transfer, the $5 is the cost of performing the transaction. Similarly, if Alice were to transfer 50 ether to Bill, the cost of gas at the time might be 1/100,000 of an ether. Gas is priced in gwei, and one gwei is equal to 10^-9 ether (ETH), so that amount works out to 10,000 gwei. Keep in mind, however, that gas serves a greater purpose than a pure transaction fee: it measures the effort it takes to execute certain operations, and that effort is tied to real computing power.
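The arithmetic behind a fee can be sketched in a few lines. The unit conversions are fixed (one gwei is 10^9 wei and one ether is 10^18 wei), and a plain ether transfer uses 21,000 gas. The gas price of 20 gwei is a made-up figure for illustration, as are the function names.

```python
from decimal import Decimal

WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18


def transaction_fee_wei(gas_used: int, gas_price_gwei: int) -> int:
    """Fee = units of gas consumed multiplied by the price paid per unit."""
    return gas_used * gas_price_gwei * WEI_PER_GWEI


def wei_to_ether(wei: int) -> Decimal:
    return Decimal(wei) / Decimal(WEI_PER_ETHER)


# A plain transfer uses 21,000 gas; 20 gwei per unit is a hypothetical price.
fee_wei = transaction_fee_wei(gas_used=21_000, gas_price_gwei=20)

print(fee_wei // WEI_PER_GWEI)  # 420000 (gwei)
print(wei_to_ether(fee_wei))    # 0.00042 (ether)
```

Note that the fee depends on the work done and the price per unit of gas, not on whether Alice sends 50 ether or 0.5 ether.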
Once the node operators verify a transaction, they are awarded the fee for their computational services. If the fee isn’t high enough, they can choose to ignore the transaction; if it is attractive, they can prioritize it. This creates a free market, since both miners and users on the network are able to set their prices. It cuts both ways, however: if miners only accepted expensive operations, no one would use the network, and if miners accepted every transaction, too many users might congest the network.
Therefore, not only can senders pick their own gas limits, but there are also block gas limits. A block gas limit sets the absolute upper bound on how much computation the blockchain can handle in a single block. Thanks to gas, the network doesn’t have to worry about hackers looping contracts infinitely and breaking it.
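The interaction between fees and the block gas limit can be sketched as a simple selection problem. The strategy below (take the highest-paying transactions first, skip any that no longer fit) is an assumed simplification and not a description of how any particular client chooses transactions. The `pack_block` function, the transaction fields, and all of the numbers are hypothetical.

```python
def pack_block(pending: list[dict], block_gas_limit: int) -> tuple[list[str], int]:
    """Greedily pick the best-paying transactions that fit under the block gas limit."""
    chosen, gas_used = [], 0
    by_price = sorted(pending, key=lambda tx: tx["gas_price_gwei"], reverse=True)
    for tx in by_price:
        if gas_used + tx["gas_limit"] <= block_gas_limit:
            chosen.append(tx["id"])
            gas_used += tx["gas_limit"]
    return chosen, gas_used


# Hypothetical pending transactions.
pending = [
    {"id": "a", "gas_price_gwei": 5, "gas_limit": 21_000},
    {"id": "b", "gas_price_gwei": 50, "gas_limit": 60_000},
    {"id": "c", "gas_price_gwei": 20, "gas_limit": 50_000},
    {"id": "d", "gas_price_gwei": 1, "gas_limit": 21_000},
]

chosen, gas_used = pack_block(pending, block_gas_limit=100_000)
print(chosen)    # ['b', 'a']
print(gas_used)  # 81000
```

Transaction c pays well but does not fit next to b, and d pays too little to make the cut, which mirrors the trade-off between price and capacity described above.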
Let’s review everything we’ve learned so far. Ethereum is a blockchain, also known as a distributed state machine, that contains different types of accounts stored in a Merkle Patricia Trie. Some accounts are contracts which are made of opcodes. Users on the network pay for opcodes with gas, which is required to send transactions.
There you have it! You just learned about the Ethereum blockchain. Stay tuned for our next topic in the series — blocks and mining.