In this article, we break down everything pertaining to blockchain hashing and how it relates to cryptocurrencies like Bitcoin. First, we’ll look at what a blockchain is and at some of its applications. Following this, we’ll unpack some of the intricacies of hashing algorithms.

Blockchain technology is not inherently monetary. To paraphrase Vitalik Buterin, the essence of blockchain is “informational and processual.” Blockchain technology could be used in a myriad of ways, and Bitcoin is simply one effective application of it.

What is a Blockchain?

A blockchain is a linked list of blocks, each of which holds a set of transactions. Every block contains data and a hash pointer to the previous block in the blockchain. A given blockchain functions based on the verification of hashes and digital signatures.

In a nutshell, hashing is a way for the blockchain to confirm its current state, whereas the digital signature ensures that a transaction can only be authorized by the owner of the address and only happens once.

Hashing as a Vital System

Cryptocurrencies like Bitcoin and Ethereum primarily rely on two computational processes: hashing and blockchain technology. It’s the revolutionary application of these technologies that is making decentralized currency and peer-to-peer transactions secure and increasingly appealing.

In a blockchain, each block stores the hash of the previous block, which makes the sequence tamper-evident. By design, a hash is very sensitive to its input, so changing any detail in a given block changes that block’s hash and causes a domino effect: the hash pointers in every subsequent block would no longer match. Blockchain hashes are also deterministic, because the same input data will produce the same result each time.
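The domino effect can be sketched in a few lines of Python. This is a minimal illustration, not Bitcoin’s actual data format: the `Block` class, its fields, and the JSON serialization are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class Block:
    index: int
    data: str
    prev_hash: str  # hash pointer to the previous block

    def hash(self) -> str:
        # Serialize the fields in a fixed order, then hash the result.
        payload = json.dumps(
            {"index": self.index, "data": self.data, "prev_hash": self.prev_hash},
            sort_keys=True,
        ).encode()
        return hashlib.sha256(payload).hexdigest()


def build_chain(items: list[str]) -> list[Block]:
    chain: list[Block] = []
    prev_hash = "0" * 64  # the first block has no predecessor
    for i, data in enumerate(items):
        block = Block(i, data, prev_hash)
        chain.append(block)
        prev_hash = block.hash()
    return chain


def is_valid(chain: list[Block]) -> bool:
    # Every block must store the current hash of the block before it.
    return all(
        chain[i].prev_hash == chain[i - 1].hash() for i in range(1, len(chain))
    )


chain = build_chain(["alice pays bob 1", "bob pays carol 2", "carol pays dan 3"])
print(is_valid(chain))  # True

chain[0].data = "alice pays bob 100"  # tamper with an early block
print(is_valid(chain))  # False: block 1 no longer points at block 0's hash
```

To hide the change, an attacker would have to recompute the stored hash in every block that follows the one they altered.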

How Bitcoin’s Algorithm Sets it Apart

Blockchain technology is not unique to cryptocurrencies; it can be applied to many kinds of electronic transactions. However, Bitcoin’s algorithm applies hashing and blockchain by relying on the participation of autonomous networks. All of these networks take part in producing and confirming hashed transactions. Transactions reach approval by way of proof-of-work, and they are additionally checked against the mutual consensus of the participating networks.

Once a transaction reaches approval, it joins the public ledger along with other approved transactions. One important application of blockchain technology is that currencies can be decentralized and peer-to-peer. To be sure, blockchain is not always a decentralized entity, as third parties such as banks and credit card companies utilize this technology as well.

The Benefits of a Public Ledger

Simply put, a blockchain like Bitcoin’s is a public, decentralized ledger. The benefit of such a system is that it can be monitored by many independent networks, rather than relying exclusively on a trusted third party and centralized currencies.

At its best, blockchain technology applied to cryptocurrency makes it possible to reduce corruption within both decentralized and centralized currencies. This is because it relies on a fellowship of participation, rather than resting solely on traditional financial institutions. Bitcoin is simply an example of a cryptocurrency that trades on the technology of hashing and blockchain, with the central goal of establishing a modern decentralized cryptocurrency.

The Example of Stone Rai

Critical to the legitimacy of a cryptocurrency is the public ledger that blockchain relies on. Here is a fun example of a long-lasting but obscure currency: the Stone Rai. The Stone Rai is a longstanding currency of the Micronesian island of Yap. Rai are large doughnut-like stones that represent wealth as well as the exchange of wealth.

It was not only valuable to have the 3.5 meter stone in your possession; the record of the transactions themselves was equally valuable. Historical records indicate that these large stones would sometimes get lost at sea during transportation. However, this did not diminish the value of the stone, nor did it necessarily void the value of the exchange, because the record of the transaction was just as valuable. And given the Rai’s mutually accepted value on Yap, the physical object did not need to be present in order to maintain its fiat value.

When Bitcoin Mining Ends

Similarly, with Bitcoin and other cryptocurrencies, without the ability for networks to express consensus in the public ledger, a cryptocurrency has no value. Bitcoin, like gold, is a limited resource. The system is designed to issue a maximum of 21 million bitcoins, with the last of them expected to be mined around the year 2140. Once all of the coins have been mined, no new bitcoins will be created. At the point of capacity, the currency will only be usable and tradable for its exchange value. At present, as Bitcoin mining continues, new coins become more and more scarce and thus more valuable.

So, in order for a decentralized currency like Bitcoin to work, it depends not only on the reliability of the blockchain but also on users possessing an equilibrium of rationality, self-interest, and altruism. That is to say, given the investment and computational power necessary for mining Bitcoin, there needs to be a future value. However, the system’s reliability also depends on the needs of the individual aligning with the health of the system. Therefore, networks need to cooperate and collaborate in order for the system to thrive.

Adding to the Blockchain

In order to add to the blockchain, miners must solve for what is called the target hash. Solving for the target hash uses an algorithm that relies on the data in the block header. In Bitcoin, each block header contains a version number, the hash of the previous block, the Merkle root of the block’s transactions, a timestamp, the current difficulty target, and the nonce.
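To make the block header more concrete, here is a rough sketch of how Bitcoin’s six header fields (version, previous block hash, Merkle root, timestamp, difficulty bits, and nonce) pack into 80 bytes. The function name `serialize_header` and all of the field values below are placeholders chosen for illustration, not data from a real block.

```python
import hashlib
import struct


def serialize_header(
    version: int,
    prev_hash: bytes,
    merkle_root: bytes,
    timestamp: int,
    bits: int,
    nonce: int,
) -> bytes:
    # "<" = little-endian, no padding; I = 4-byte unsigned int; 32s = 32 raw bytes.
    return struct.pack(
        "<I32s32sIII", version, prev_hash, merkle_root, timestamp, bits, nonce
    )


# Placeholder values, for illustration only.
header = serialize_header(
    version=1,
    prev_hash=bytes(32),  # all zeros, as if there were no previous block
    merkle_root=hashlib.sha256(b"example transactions").digest(),
    timestamp=1_700_000_000,
    bits=0x1D00FFFF,  # compact encoding of the difficulty target
    nonce=0,
)
print(len(header))  # 80
```

It is these 80 bytes, not the full list of transactions, that miners hash over and over while searching for a valid nonce.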

What is a Nonce?

The nonce is a “number only used once.” In a Bitcoin block, it is a 32-bit (4-byte) field. The value of the nonce is adjusted by miners so that the hash of the block will be less than or equal to the current target of the network. As this iterative calculation requires time and resources, the presentation of the block with the correct nonce value constitutes what’s known as proof-of-work.
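The nonce search can be sketched as a simple loop. This is a simplified illustration under a few assumptions: `mine` and `EASY_TARGET` are made-up names, the block data is a placeholder string rather than a real 80-byte header, the hash is read as a big-endian number for simplicity, and the target is far easier than anything on the real network so that the loop finishes almost instantly.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    # Bitcoin hashes block headers with SHA-256 applied twice.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(block_data: bytes, target: int, max_nonce: int = 2**32):
    # Try each 32-bit nonce until the hash, read as a number, is <= target.
    for nonce in range(max_nonce):
        candidate = block_data + nonce.to_bytes(4, "little")
        digest = double_sha256(candidate)
        if int.from_bytes(digest, "big") <= target:
            return nonce, digest.hex()
    return None  # nonce space exhausted without success


# A deliberately easy target: roughly 1 in 4,096 hashes will qualify.
EASY_TARGET = 2**244 - 1

result = mine(b"example block data", EASY_TARGET)
if result is not None:
    nonce, digest_hex = result
    print(nonce, digest_hex)  # the hash starts with three zeros
```

Lowering the target makes qualifying hashes rarer, which is how the network makes mining harder.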

What is a Genesis Block?

The Genesis Block is the first block in a blockchain, the one that starts a new ledger (or coin, in the case of Bitcoin). A crucial feature of the blockchain is that it relies on hash pointers, which contain the address of the previous block as well as the hash of that block’s data. The blockchain is similar to a sequence of chain links, where each link is connected to its antecedent link.

Blocks are connected to the previous block using a hash pointer. An ordinary pointer only “points” to the address where data is stored, whereas a hash pointer also stores a hash of that data. Data from the previous block is hashed (not encrypted) into a new series of letters and numbers of a fixed length.

Continuing, the nonce has a high min-entropy because its value is chosen from a large range of possibilities, which means that the winning value cannot be predicted in advance. High min-entropy means that there is a low likelihood of guessing a value that produces the target hash. A miner is only successful if they meet the target hash, at which point the winning nonce is recorded in the block.

PoW and Hashing

In order for new blocks to be accepted into an existing blockchain, a proof-of-work must be generated. The proof-of-work is a hash, a fixed-length string of letters and numbers, that meets the required target. In Bitcoin, it is computed with the double SHA-256 hashing algorithm, meaning that SHA-256 is applied twice.

That means that once the target hash has been obtained, then the block is accepted into the public ledger by the consensus of other participating networks.

Because the blockchain only contains those transactions that have been validated, this prevents fraudulent transactions being added to the chain, or the problem of double-spending by reusing the same transaction twice.

The True Importance of Blockchain Tech

This is why cryptocurrencies like Bitcoin depend on the features of blockchain technology. Using a block of hashes in an interdependent sequence replaces the need for a trusted third party. Instead, a publicly published list of transactions functions as the guarantor. This is done through a public system of participation that vouches for its own authenticity. Or more correctly, this guarantee comes from the individual networks that maintain the established blockchain.

So in this sense, a blockchain like Bitcoin’s is self-guaranteeing, because there are multiple networks continuously approving the transactions. The currency can then be used successfully in a transaction and/or exchange of goods. The Bitcoin blockchain is thus a public ledger composed of blocks that have been successfully hashed and added to the list of transactions mutually approved by independent networks.

What is Hash and Hashing?

A hashing algorithm is a computational function that condenses input data into a fixed size, the result of which is the output called a hash or a hash value. Hashes are used to identify, compare or run calculations against files and strings of data. Typically, a program computes the hash of a file and then compares it with a previously recorded hash of the original.

If you didn’t love doing math in school, that is ok, because while hashing relies on some pretty crazy Alan Turing-esque computations, a computer program does all the math for you. So all you need to remember from math class are the basics of exponents and probability functions.

Example of Blockchain Hashing

A basic example of hashing is digitally signing a piece of software so that it can be safely offered for download. To do this, the publisher computes a hash of the program and then creates a digital signature over that hash.

The software and its signature are then made available for download. When someone downloads the software, the browser runs the same hash function, using the same algorithm, over the downloaded file and uses the publisher’s public key to check the signature against that hash. If the browser arrives at the same hash value, it can confirm that both the signature and the file are authentic and that they have not been altered.
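The hash comparison part of that check is easy to sketch. The example below covers only the integrity step (does the downloaded file hash to the published value?) and leaves out signature verification entirely; the function names and the byte strings standing in for a real file are assumptions made for the illustration.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def matches_published_hash(downloaded: bytes, published_hex: str) -> bool:
    # Recompute the hash locally and compare it with the published value.
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sha256_hex(downloaded), published_hex.lower())


original = b"print('hello from the installer')"
published = sha256_hex(original)  # what the publisher would list on its site

print(matches_published_hash(original, published))  # True
print(matches_published_hash(original + b" ", published))  # False
```

Even a single added space changes the hash, so the altered file is rejected.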

The Meaning of Deterministic in Hash Values

Hash values are deterministic: they depend only on the input data and the algorithm. The reason hashing is so useful for cryptocurrencies is that it is practically impossible to reproduce the same hash with a different data set as the input.

The resultant hash of the input data is both effectively unique and irreversible. For example, an input of “123” will always have the same output. If it didn’t, and a different output came up every time it was hashed, there would be no consistency or validity to the process, because your programs would never speak the same language. The hash used for Bitcoin is a 64-digit hexadecimal number, which I will explain shortly.
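You can check the “123” example yourself with Python’s standard `hashlib` module. This is just a quick demonstration; the helper name `sha256_hex` is made up for the example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("123")
second = sha256_hex("123")
changed = sha256_hex("124")

print(first == second)  # True: same input, same hash, every time
print(len(first))  # 64 hexadecimal digits = 256 bits
print(first == changed)  # False: a one-character change gives a different hash
```

Printing `first` and `changed` side by side shows that the two hashes look completely unrelated, even though the inputs differ by a single character.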

Digital Signatures

Hashing is also used together with digital signatures. For example, SSL certificates (SSL/TLS protocol) play a role in making secure data transmission from one device to another possible. Digital signatures bind a key to a dataset, so that the recipient can authenticate the data it receives. SSL certificates, therefore, need to match a specific public key to the intended party, kind of like a lock and key.

SSL/TLS uses asymmetric encryption, which makes secure key exchanges possible. The security of this transfer relies on two keys: a public key used for encryption and a private key for the recipient’s decryption. Digital signatures are very sensitive, and small changes to the signed data result in a very different hash.
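To see how a key pair and a hash fit together, here is a toy sketch of signing with textbook RSA. Everything in it is an assumption made for illustration: the primes are tiny by cryptographic standards, there is no padding, and Bitcoin itself uses a different signature scheme (ECDSA). It should never be used for anything real.

```python
import hashlib

# Two known Mersenne primes; far too small for real-world security.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q  # public modulus
E = 65537  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    # Hash the message, then reduce the hash so it fits under the modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    # Only the holder of the private exponent D can produce this value.
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    # Anyone with the public values (N, E) can check the signature.
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"pay 1 BTC to Bob")
print(verify(b"pay 1 BTC to Bob", signature))  # True
print(verify(b"pay 9 BTC to Bob", signature))  # False
```

Because the signature is computed over the hash of the message, changing even one character of the message makes verification fail.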

SHA-256

Presently, SHA-256 is one of the most widely trusted hashing functions. Its output is one of 2^256 possible values, determined by the given input data. SHA stands for Secure Hash Algorithm, and 256 expresses the fixed length of the output in bits. This means that the target is also a 256-bit number and, as mentioned, Bitcoin uses a 64-digit hexadecimal hash value.

Using the SHA-256 function makes it (nearly) impossible to duplicate a hash, because there are just too many combinations to try and process. Mining therefore requires a significant amount of computational work, so much so that Bitcoins are no longer mined with personal computers and presently rely on Application-Specific Integrated Circuits, or ASICs. The probability of guessing one specific hash is 1 in 2^256. If you remember your exponents, you will deduce that this is an incredibly difficult value to hit.

Furthermore, such a hash is intentionally computationally impractical to reverse. The intended result is that solving for the input requires a random or brute-force method.

A Case in Point

Consider the following: if I have one six-sided die, I have a 1 in 6 chance of rolling a 6. However, the more sides my die has (say 256 sides), the more my chances of rolling a 6 decrease. That’s 1 in 256, which is still far better than your odds of using brute force on an existing hash.

A hash rate is the speed at which hashing operations take place during the mining process. If the hash rate gets too high and miners solve the target hash too quickly, blocks are added faster than intended. When that happens, the difficulty of the target needs adjusting accordingly. For example, at present, a Bitcoin block is mined about every 10 minutes.
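To put numbers on the dice comparison, here is a small back-of-the-envelope calculation. The guessing rate below is a made-up figure chosen only to show the scale; it is not a measurement of any real hardware.

```python
from fractions import Fraction

six_sided = Fraction(1, 6)
many_sided = Fraction(1, 256)
one_specific_hash = Fraction(1, 2**256)

print(float(six_sided))  # about 0.167
print(float(many_sided))  # about 0.0039
print(float(one_specific_hash))  # vanishingly small

SPACE = 2**256
print(len(str(SPACE)))  # 78 decimal digits

# Hypothetical guessing rate, chosen only for illustration.
HASHES_PER_SECOND = 10**20
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
years = SPACE // (HASHES_PER_SECOND * SECONDS_PER_YEAR)
print(f"{years:.2e} years to try every possible value")
```

Even at that generous made-up rate, trying every value would take vastly longer than the age of the universe.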
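The adjustment can be sketched as a simple proportion. The constants below (a retarget every 2,016 blocks, a two-week expected timespan, and a cap of a factor of four in either direction) come from Bitcoin’s protocol rules rather than from this article, and the function is a simplified illustration that ignores details such as the maximum allowed target.

```python
TARGET_BLOCK_TIME = 10 * 60  # ten minutes, in seconds
RETARGET_INTERVAL = 2016  # blocks between adjustments
EXPECTED_TIMESPAN = TARGET_BLOCK_TIME * RETARGET_INTERVAL  # two weeks


def retarget(old_target: int, actual_timespan: int) -> int:
    # Limit each adjustment to a factor of four in either direction.
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    # Blocks found too quickly -> smaller target -> harder puzzle.
    return old_target * clamped // EXPECTED_TIMESPAN


old_target = 2**224
# Suppose the last 2,016 blocks took one week instead of two.
new_target = retarget(old_target, EXPECTED_TIMESPAN // 2)
print(new_target == old_target // 2)  # True: the target halves
```

A smaller target means fewer hashes qualify, which pushes the average block time back toward ten minutes.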

Collision Resistance

Due to the complexity and sensitivity of SHA-256, reversing the hash in an effort to find the original input data is basically impossible. SHA-256 is also extremely secure because it is “collision resistant.” Collision resistance means that the likelihood of two different inputs producing the same hash is minuscule.

Therefore, given the number of possible SHA-256 outputs, the probability of a collision is negligible. Compare two hash functions: SHA-1 produces a 160-bit hash, while SHA-256 produces a 256-bit hash. The longer hash has a far larger space of possible outputs and as a consequence is far more collision resistant.
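One way to get a feel for why output size matters is to shrink the hash on purpose. The sketch below keeps only the first 2 bytes (16 bits) of a SHA-256 hash and searches for two inputs that collide. The truncation and the function names are assumptions made purely for the demonstration; nothing here weakens the full 256-bit function.

```python
import hashlib


def truncated_hash(data: bytes, n_bytes: int = 2) -> bytes:
    # Keep only the first n_bytes of the real SHA-256 output.
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 2):
    seen = {}  # truncated hash -> first message that produced it
    counter = 0
    while True:
        message = str(counter).encode()
        tag = truncated_hash(message, n_bytes)
        if tag in seen:
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1


first, second, attempts = find_collision()
print(first, second, attempts)
print(truncated_hash(first) == truncated_hash(second))  # True
```

With only 65,536 possible outputs, a collision typically turns up after a few hundred tries. Each additional bit of output makes the same search dramatically more expensive, and at 256 bits it is out of reach.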

Here are a few examples of other cryptographic hash functions and when collision resistance broke, and it will become evident why SHA-256 is currently the favored hash:

  • MD5: Produces a 128-bit hash.
    • Collision resistance was broken after ~2^21 hashes.
  • SHA-1: Produces a 160-bit hash.
    • Collision resistance was broken after ~2^61 hashes.
  • SHA-256: Produces a 256-bit hash.
    • This is currently being used by Bitcoin.
  • Keccak-256: Produces a 256-bit hash.
    • Currently used by Ethereum.
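The output sizes in this list are easy to confirm with Python’s standard `hashlib` module. One caveat: the `sha3_256` function shown here is the standardized SHA-3, which uses different padding from the original Keccak-256 that Ethereum uses, so the two produce different hashes of the same length. It stands in below only to illustrate the 256-bit output size. The helper name `digest_bits` is made up for the example.

```python
import hashlib


def digest_bits(name: str) -> int:
    # digest_size is reported in bytes; convert to bits.
    return hashlib.new(name).digest_size * 8


for name in ("md5", "sha1", "sha256", "sha3_256"):
    digest = hashlib.new(name, b"blockchain").hexdigest()
    print(f"{name:<9} {digest_bits(name):>3} bits  {digest}")
```

The same input produces a visibly longer hash as the bit length grows from 128 to 160 to 256.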

Merkle Tree and Merkle Roots

As blocks continue to be added to a growing blockchain, there is a need to reclaim storage space; this is the role of the Merkle Tree. Rather than storing every transaction, only the root of the hash tree is stored (the Merkle Root), so it is still possible to verify the blockchain without sorting through all of the data.

Verification processes are therefore simplified by following a branch that links a transaction to the block it was time-stamped in, without needing the complete transactions themselves (as these take up a lot of space). Instead, the check is to ensure that the previous network node accepted the transaction. This continually affirms its reliability, and so subsequent blocks are added to the chain and the value can be traded on.

How is this done? The Merkle Root summarizes all of the data in the related transactions and is stored in the block header. Just as we saw is the case for hashes, if a single detail in any of the transactions is altered, so is the Merkle Root. Using a Merkle tree makes testing to see whether a specific transaction is included in the set much more efficient than going through all of the blocks in the chain.
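Here is a rough sketch of how such an inclusion check could work for a tree with four transactions. The tree is built by hand, the function names are made up, and a single SHA-256 is used for simplicity (Bitcoin applies SHA-256 twice), so treat it as an illustration of the idea rather than Bitcoin’s exact procedure.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def verify_proof(leaf: bytes, proof: list[tuple[bytes, str]], root: bytes) -> bool:
    # Walk up the branch, combining with each sibling hash along the way.
    current = h(leaf)
    for sibling, side in proof:
        if side == "left":
            current = h(sibling + current)
        else:
            current = h(current + sibling)
    return current == root


# A four-leaf tree built by hand for the example.
ha, hb, hc, hd = (h(tx) for tx in (b"A", b"B", b"C", b"D"))
hab, hcd = h(ha + hb), h(hc + hd)
root = h(hab + hcd)

# To prove that C is included, only two sibling hashes are needed.
proof_for_c = [(hd, "right"), (hab, "left")]
print(verify_proof(b"C", proof_for_c, root))  # True
print(verify_proof(b"X", proof_for_c, root))  # False
```

The proof needs one hash per level of the tree, so its size grows with the logarithm of the number of transactions rather than with the number itself.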

The Creation of Merkle Trees

Merkle trees are created by repeatedly hashing pairs of nodes until there is only one hash left. This hash is the Root Hash, or the Merkle Root. Each leaf node is a hash of transactional data, and each non-leaf node is a hash of its child hashes. Merkle trees are binary and therefore require an even number of leaf nodes. If the number of transactions is odd, the last hash is duplicated once to create an even number of leaf nodes.

I will borrow an example from Shaan Ray. Here are four transactions in a block: A, B, C, and D. Each of these is hashed, and the hash stored in each leaf node, resulting in Hash A, B, C, and D. Consecutive pairs of leaf nodes are then summarized in a parent node by hashing Hash A and Hash B, resulting in Hash AB, and separately hashing Hash C and Hash D, resulting in Hash CD. The two hashes (Hash AB and Hash CD) are then hashed again to produce the Root Hash (the Merkle Root).
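The A, B, C, D example translates almost directly into code. This is a simplified sketch: the function names are made up, and it uses a single SHA-256 per node, whereas Bitcoin applies SHA-256 twice.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    if not transactions:
        raise ValueError("need at least one transaction")
    level = [h(tx) for tx in transactions]  # leaf nodes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: duplicate the last hash
        # Hash each consecutive pair to form the parent level.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"A", b"B", b"C", b"D"])
print(root.hex())

# Changing a single transaction changes the root.
print(merkle_root([b"A", b"B", b"C", b"E"]) == root)  # False
```

With four transactions, the loop runs twice: once to produce Hash AB and Hash CD, and once more to combine them into the Merkle Root.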

Conclusion and Review

For some, cryptocurrencies may seem too ephemeral to trust, but the basic idea of a currency like Bitcoin relies on the typical monetary practices of a fiat system. In fact, similar monetary systems are already used frequently, as many transactions and bank balances rely on data rather than the physical presence of hard currency (like gold).

A crucial difference in the application of blockchain in terms of cryptocurrencies is that typically an exchange of currency requires a third party as guarantor. This would be similar to a bank or credit card company. However, the application of blockchain technology in cryptocurrencies is disrupting the need for a third party, as well as making non-cash peer-to-peer transactions more secure and desirable.

Here is what you should take away from this article, What is Blockchain Hashing?:

Blockchain Technology

  • Blockchain as a public ledger: By relying on individual networks, transactions are hashed and added to a public record. Participating networks maintain and thus approve the record. The blockchain ledger is essential in order to maintain validity as a currency. Additionally, it makes the reliability of decentralized cryptocurrencies possible.
  • Adding blocks to the blockchain: To add to the blockchain, miners mine for the target hash. This relies on data from the block header. In Bitcoin, each block header contains a version number, the hash of the previous block, the Merkle root, a timestamp, the difficulty target, and the nonce.
  • Proof-of-work: Once the target-hash has been solved, a proof-of-work is produced. This is related to SHA-256 and the current level of difficulty for solving it.
  • Nonce: The nonce is a “number only used once.” In a Bitcoin block, it is a 32-bit (4-byte) field. The value of the nonce is adjusted by miners so that the hash of the block will be less than or equal to the current target of the network.

Blockchain Hashing

Hash Algorithms

Hash algorithms are computational functions. The process condenses input data into a fixed size, and the result is an output called a hash or a hash value. Hashes are used to identify, compare or run calculations against files and strings of data. When attempting to add to an existing blockchain, the program must solve for the target-hash in order for the new block to be accepted.

  • Hashes are deterministic and pre-image resistant:
    • Deterministic: a particular set of input data will always produce the same result. This makes it possible to keep track of transactions.
    • Pre-image resistant: it is nearly impossible to recreate the input from the output data.
  • SHA-256:
    • This produces a 256-bit hash. Because the output space is so large, there are too many possible outcomes to compare hashes against, which makes solving backward nearly impossible. So if one wanted to try to solve for a target hash, they would need to begin with a random input, then test its hash against the target hash. This would take a nearly incalculable amount of time.
    • Collision Resistance: SHA-256 is collision resistant because of the enormous number of possible outputs, so two different inputs arriving at the same hash is nearly impossible. This is also a result of using a target with high min-entropy.

Merkle Trees and Roots

The Merkle Root summarizes all of the data in the related transactions and is stored in the block header. Just as we saw with hashes, if a single detail in any of the transactions is altered, so is the Merkle Root. Using a Merkle Tree makes testing to see if a specific transaction is included in the set efficient, rather than going through all of the blocks.

Sources:
http://kddlab.zjgsu.edu.cn:7200/research/blockchain/hehonghao-reference/34-Blockchain%20Technology_%20Principles%20and%20Applications.pdf
https://www.thesslstore.com/blog/difference-sha-1-sha-2-sha-256-hash-algorithms/
https://bitcoin.org/bitcoin.pdf
https://www.techopedia.com/definition/3470/concatenation-programming
https://hackernoon.com/merkle-trees-181cb4bc30b4