Yes, before you click away, you are currently on a blockchain blog. We know, the topic of this discussion may have us racking our brains for all our knowledge on botany; however, a Merkle tree is far from anything you’d find in nature.
In this post, we will talk about the use of Merkle trees in the context of blockchains and cryptocurrency networks. We will also consider why the Merkle tree plays such an important part in ensuring these networks are both efficient and reliable.
As you may recall, a blockchain is a decentralized payment network with a shared ledger (database) of transactions. Multiple computers on this network must keep copies of this ledger. As you can imagine, thousands of transactions occur over a short period of time.
That’s a lot of transactions too say the least! That isn’t it either. Remember that each of these transactions must then be verified by all the other computers on the network. So now you might be asking, how is a blockchain network able to handle such a large volume of data and verify each transaction? Moreso, how can the blockchain do this while maintaining the integrity of the data without large amounts of storage space on a network, or massive amounts of computing power?
Out Comes the Merkle Tree
This is where the Merkle tree comes into play. In simple terms, the beauty of the Merkle tree is its ability to take large amounts of input data and compress it into an output of fixed length. This output consists of just a single string of characters called a Merkle root. The Merkle root can easily act as proof of the validity of the data that it represents. It can also easily be shared with others on the network. This means that copies of the entire database do not need to be maintained by each and every user. If an error occurs, we do not need to do a line-by-line search of the entire database.
However, before we can understand how a Merkle tree works, we need to understand the basics of hash functions. This is because hash functions are the key technology underlying a Merkle tree.
Hash Functions Explained
Notice what happens if we only change one minor detail of the transaction and apply the same hash function. Our output is completely different.
The resulting hash value is unique to the input data and is like a fingerprint of the data. You probably noticed that the two output strings are very different despite having just a small change in the input data. Therefore, it is pretty easy to determine when a file is corrupted.
To illustrate how a hash function works, we can use a simple analogy. Please bear in mind that this is a very simplistic view of hash functions. In reality, hash functions are just slightly more complicated.
Hash Functions Are a Piece of Cake
Suppose you want to bake a cake. You would take a number of ingredients (the input), follow the list of instructions in the recipe (the mathematical algorithm of the hash function), and the result is a cake (the output or hash value). If you always use the same ingredients and follow the same recipe, you will always end up with the same cake. No matter how many times you do this, the result is always the same. Likewise, with the hash function, the same input data will always generate the same string of output characters (the Hash value), no matter how many times it is repeated.
However, as with baking a cake, you can vary the ingredients, follow the same recipe, and end up with a different cake. Likewise, with a hash function, changing the input data will produce a very different output value.
Just as a recipe can be applied to a different list of ingredients, one hash function (or algorithm) can be applied over and over again to different inputs. Each generating a different output. These outputs will then each with its own unique set of characters or fingerprints. In Bitcoin, the algorithm used is the SHA256 algorithm, which generates an output string which is 256 bits in length.
The interesting thing with a hash function is if I vary the input even slightly, the output can be a drastically different string of characters. This makes it very easy to detect corruption in the input data, thus making it easy to verify the integrity of the data. Likewise, with a cake, if we were to change just one ingredient, the result would be easily detected. For instance, if I add chocolate chips to my cake instead of walnuts, the difference is pretty obvious.
Hashing also saves storage. Instead of storing huge amounts of input data, you only need to store the hash value. Similar to the cake comparison, rather than storing an endless amount of ingredients, you only need to store the cake. Additionally, since I have the recipe, I can reproduce the cake. This also saves space in the pantry. Why can we say this?
The cake can always be duplicated by gathering the ingredients and following the same recipe using the same ingredients. The ingredients itself can be stored in many different places such as at a neighbor’s, a friend’s, or even with Grandma, and we can retrieve those ingredients only when we need them. Likewise, with a distributed network, we can use different sources to store smaller subsets of the data and do away with the need for central storage. And if we detect a problem or need to verify some data, we only need to retrieve the parts we need. This contributes to the efficiency of a blockchain.
The 4 Benefits of a Hash Function
Okay we get it there are so many great things about hash functions, how are we supposed to remember them all? A handy summary of course! Hash functions are useful for these four main reasons:
1. By applying a hash function, you can take a huge data set and compress it to a single output of fixed length (hash value). This will save storage space on the computer because one only needs to store the hash value, instead of the entire data set.
2. The hash value is unique to the input data and is therefore like a fingerprint. Even one minor change to the input will result in a vastly different output. This makes it easy to detect any corruption to the data set and to verify the integrity of the data.
3. It is deterministic; that is, no matter how many times you run the hash function with the same input data, it will always generate the same hash value.
4. It is a one-way function, meaning it is impossible to take the hash value and extract the input data from it; therefore, it is secure.
Now the moment you’ve all been waiting for. Let’s talk about the Merkle tree.
The Merkle Tree Concept
The concept of the Merkle tree dates back to 1979, before the existence of blockchain, and was named after Ralph Merkle, the computer scientist who patented the idea. He came up with the Merkle tree as a way for large amounts of data to be stored and verified in an efficient and secure way.
The key technology underlying a Merkle tree is the hash function, which serves the dual purpose of both encrypting the data and compressing it into a manageable size. The basic structure of a Merkle tree is like that of a reverse tree as shown in the following diagram:
As with any tree, there are leaves, branches, and a root. The leaves are the transactions (or blocks of transactions) that make up the input data. A hash function is applied to each block of transactions (T1, T2, T3, T4) to create what are called leaf nodes (W, X, Y, Z). The leaf nodes are the start of the Merkle tree.
In this example, we can concatenate (join together) two leaf nodes (W+X) to create a parent node(A) by applying the hash function again. We can do the same with leaf nodes (Y+Z) to create a parent node (B). We can then hash parent nodes A+B together to get what is called the Merkle Root. The result is an output made up of a string of characters with a fixed length. Now instead of storing a large set of data, we can represent that entire dataset with just the Merkle root.
A More Complicated Tree
The Merkle tree depicted above is a simple example. In reality, a Merkle tree can be much taller, having more levels, and much wider, incorporating more data sets. The process is the same. The hash function is applied repeatedly until we are left with just the Merkle root.
On a blockchain network, we can store the Merkle root in a secure place at a trusted source, and distribute the responsibility of storing the rest of the information at other sources on the network. Without revealing any actual data or requiring the contents of the entire database, we can verify subsets of the data very quickly just by matching each hash with the Merkle root. Any errors can be detected quickly by requesting the hash of a smaller subset until the corrupted block is found. This increases efficiency by not requiring that an entire database be searched for just one error.
What is a Merkle root?
To illustrate how a Merkel tree can simplify the data verification process in a distributed network, we figured an example was in order. Let’s look back at the same chart above for this example. Say I hold a copy of a subset of the data, T1, and I wanted to confirm that the data is correct. Without knowing the contents of the entire database, I can request the value of X and B from other sources on the network. With this, I can calculate the Merkle root.
So having T1, I can take its hash, and get W. Then I can take the hash of W and X to get A. Having A and knowing the value of B, I can take the hash of A and B to generate the Merkle root, which I can then compare to the original, thus verifying the integrity of data set T1, without actually knowing the contents of the rest of the database.
First of all, what is a DAG? The acronym DAG stands for the directed acyclic graph. Without going into too much detail, a DAG basically describes how information in a network of connected nodes are passed from one node to the next. Acyclic graphs are graphs that do not contain any cycles. In other words, there is no path for information to return to a node that has already been encountered. The directed part of the graph indicates that information is only directed in a single direction. This means the sequence goes from earlier to later. We could also call this kind of ordering topographical ordering.
A blockchain can be said to be an example of a simple directed acyclic graph. In a blockchain, there is a single chain of blocks, with each block referencing the previous one.
Note how in this digram the information is only directed in a single direction.
The Merkle Dag
First things first, a Merkle DAG is a Merkle directed acyclic graph. It has a similar structure to a Merkle tree but is not as strict. There are no balance requirements. Each node can contain data, and because several branches can reconverge, a node can have several parents. This is unlike a Merkle tree which is generally binary in nature. Where each parent is the hash of two children.
What makes DAG interesting in the world of blockchain is that its structure opens up the possibility of much faster processing times for transactions. In a standard blockchain, such as Bitcoin, there is only a single chain of blocks. All the transactions that occur at the same time are grouped into a block. Miners then compete for the validation of that block and once validated, the block is linked to the previous block. The average time for a block to be mined is about 10 minutes. As you can imagine, if this system were to be adopted by a wide community of users, this can lead to long transaction times and inefficiencies in the system, as well as a high cost for each transaction.
Why do we need DAG?
DAG allows for the possibility for multiple side chains, that branch off a single block, to exist. Branched chains are created when two miners produce a block at the same time. When this happens in a blockchain, an arbitrary period of time, called block time, is enforced to give the network time to consolidate and verify which block is correct. If a malicious branch is detected, it is orphaned or discontinued.
In a DAG, since multiple chains can be created at the same time, we can have different transactions running on different chains simultaneously. They can co-exist in parallel, as long as the information is directed in the same one way. This eliminates the need for block time to validate which block is correct. It also reduces the time wasted on orphaned branches.
Whereas blockchain relies on peer-to-peer confirmations of all transactions, confirmations within the DAG-chain occurs through the transactions themselves. Therefore, the more users there are, the faster the confirmation times are.
A Lesson on Verification
Verification only needs to come from the two closest nodes. This lessens the need for complex algorithms and allows for the elimination of the external mining process. The result is an almost instantaneous transaction speed and lowered costs for all users on the network.
DAG opens the way for the mass adoption of crypto. This is because it offers the potential for thousands of transactions to be verified every second. The process is also way cheaper for the user. For comparison, the average time for a block to be mined is 10 minutes, the average transaction time for a DAG network is almost instantaneous. This is especially important if cryptocurrencies are to have greater appeal to the masses and to financial institutions.
And there you have it. A lesson, not in botany, but in a very important tool in the world of blockchain technology. If you are to take away anything, remember a Merkle tree is especially useful in managing large amounts of data. It also ensures that that data can be efficiently retrieved and verified in a distributed network.
Finally, there can be variations to the Merkle tree concept, such as the Merkle DAG, which could allow for faster and even more complex forms of cryptocurrency networks. This means wide-scale adoption by the masses and greatly reduced transaction costs. Yes, the future is bright when you have trees, Merkle trees, that is.