In this article, I go through some specifics of the relationship between *blockchain* and *hashing*, and how these are being used for cryptocurrencies like *Bitcoin*. First I look at what blockchain is and how it is being applied. Following this, I unpack some of the intricacies of hashing algorithms.

Blockchain technology is not inherently monetary; rather, a blockchain is a sequence of blocks in which each block holds a *hash pointer* to the previous block in the blockchain. A given blockchain functions based on the verification of *hashes* and *digital signatures*. Hashing is a way for the blockchain to confirm its current state, while a digital signature is required so that a transaction can only be made by the owner of the address.

## Hashing as a Vital System

Cryptocurrencies like Bitcoin and Ethereum primarily rely on two computational processes: *hashing* and *blockchain technology*. It is the novel application of these technologies that makes decentralized currency and peer-to-peer transactions secure and increasingly appealing.

In a blockchain, each block stores the hash of the previous block, which makes the sequence tamper-evident because, by design, a hash is very sensitive to its input. Changing any detail of any transaction in a given block changes that block’s hash, which causes a domino effect: every subsequent block no longer points to a valid predecessor. Blockchain hashes are also *deterministic*, because the same input data will produce the same result each time.
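
The domino effect can be sketched in a few lines of Python. This is a toy model, not Bitcoin's actual block format: the block layout, the `block_hash` and `first_invalid_block` helpers, and the sample records are all assumptions made for illustration.

```python
import hashlib

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the very first block


def block_hash(prev_hash: str, data: str) -> str:
    """Hash a block's data together with the previous block's hash."""
    return hashlib.sha256((prev_hash + data).encode("utf-8")).hexdigest()


def build_chain(records: list[str]) -> list[dict]:
    """Link each record to its predecessor through the predecessor's hash."""
    chain = []
    prev = GENESIS_PREV
    for data in records:
        current = block_hash(prev, data)
        chain.append({"prev_hash": prev, "data": data, "hash": current})
        prev = current
    return chain


def first_invalid_block(chain: list[dict]):
    """Return the index of the first block that fails verification, or None."""
    prev = GENESIS_PREV
    for index, block in enumerate(chain):
        links_ok = block["prev_hash"] == prev
        hash_ok = block_hash(block["prev_hash"], block["data"]) == block["hash"]
        if not (links_ok and hash_ok):
            return index
        prev = block["hash"]
    return None


chain = build_chain(["Alice pays Bob 1", "Bob pays Carol 2", "Carol pays Dan 3"])
print(first_invalid_block(chain))  # None: the untouched chain verifies

chain[1]["data"] = "Bob pays Carol 200"  # tamper with the middle block
print(first_invalid_block(chain))  # 1: the data no longer matches the stored hash

# Even if the attacker recomputes block 1's hash, block 2 still points at the old one.
chain[1]["hash"] = block_hash(chain[1]["prev_hash"], chain[1]["data"])
print(first_invalid_block(chain))  # 2
```

To hide the change, an attacker would have to recompute every later block as well, which is what proof-of-work makes expensive.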

Blockchain technology is not unique to cryptocurrencies; it can be used to record many kinds of electronic transactions. However, Bitcoin applies hashing and blockchain by relying on the participation of autonomous network nodes, all of which take part in producing and confirming blocks of hashed transactions. These blocks are approved by way of *proof-of-work* and checked against the mutual consensus of the participating networks. Once a transaction is approved it can be added to the public ledger along with other approved transactions. As mentioned, an important application of blockchain technology is that currencies can be decentralized and peer-to-peer. To be sure, blockchain is not always decentralized, as third parties like banks and credit card companies utilize this technology as well.

Simply put, the blockchain behind a cryptocurrency *is* a *public decentralized ledger*. The benefit of such a system is that it can be monitored by many independent networks, rather than relying exclusively on a trusted third party and centralized currencies. At its best, blockchain technology applied to cryptocurrency makes a reduction of corruption within both decentralized *and* centralized currencies possible. This is because it relies on a fellowship of participation, rather than resting solely on traditional financial institutions. Bitcoin is simply an example of a cryptocurrency that trades on the technology of hashing and blockchain, with the central goal of establishing a modern decentralized cryptocurrency.

Critical to the legitimacy of a cryptocurrency is the public ledger that blockchain relies on. Here is a fun example of a long-lasting but obscure currency: the Rai stone. The Rai stone is a longstanding currency of the Micronesian island of Yap. *Rai* are large doughnut-like stones that represent wealth as well as the exchange of wealth. It was not only valuable to have a stone, which could be as large as 3.5 meters, in your possession; the record of the transactions themselves was equally valuable. Historical records indicate that these large stones would sometimes get lost at sea during transportation. However, this did not diminish the value of the stone, nor did it necessarily void the value of the exchange, because the *record* of the transaction was just as valuable. And given the Rai’s mutually accepted value among the people of Yap, the physical object did not need to be present in order to maintain its fiat value.

Similarly, with Bitcoin and other cryptocurrencies, without the ability for networks to express consensus in the public ledger, a cryptocurrency has no value. Bitcoin, like gold, is a limited resource. The system is designed to yield a maximum of 21 million bitcoins, with the last of them expected to be mined around the year 2140; once all of the bitcoins have been mined, no new coins will be created. At that point, the currency will only be used and traded for its exchange value. In the meantime, the reward for mining a block is halved roughly every four years, so newly mined bitcoin becomes more and more scarce.
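
The cap of roughly 21 million bitcoins is not a number stored somewhere; it follows from the reward schedule. As a rough sketch of the arithmetic, assuming Bitcoin's published parameters (an initial reward of 50 bitcoins per block, halved every 210,000 blocks, counted in whole satoshis), the total can be added up as follows. The constant and function names are my own.

```python
COIN = 100_000_000            # satoshis per bitcoin
HALVING_INTERVAL = 210_000    # blocks between reward halvings
INITIAL_SUBSIDY = 50 * COIN   # block reward at launch, in satoshis


def total_supply_satoshis() -> int:
    """Sum the block rewards over every halving period until the reward hits zero."""
    total = 0
    subsidy = INITIAL_SUBSIDY
    while subsidy > 0:
        total += HALVING_INTERVAL * subsidy
        subsidy //= 2  # integer halving, rounding down to whole satoshis
    return total


supply = total_supply_satoshis()
print(supply / COIN)  # just under 21 million bitcoins
```

Because the reward is rounded down to whole satoshis at each halving, the total lands slightly below 21 million rather than exactly on it.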

So, in order for a decentralized currency like Bitcoin to work, it not only depends on the reliability of the blockchain; it also relies on users possessing an equilibrium of rationality, self-interest, and altruism. That is to say, given the investment and computational power necessary for mining Bitcoin, there needs to be a future value. However, the system’s reliability also depends on the needs of the individual aligning with the health of the system. Therefore, networks need to cooperate and collaborate in order for the system to thrive.

**Adding to the Blockchain**

In order to add to the blockchain, miners must find a block hash that meets what is called the *target hash*. The hash is computed by an algorithm that relies on the data from the block header. Each block contains a block header with a version number, the hash of the previous block, the Merkle root of the block’s transactions, a timestamp, the current difficulty target, and the *nonce*.
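
To make the header fields concrete, here is a minimal sketch of how the six fields of a Bitcoin block header fit into 80 bytes. The `BlockHeader` class and its method names are my own, and the field values in the usage lines are placeholders rather than a real block.

```python
import hashlib
import struct
from dataclasses import dataclass


@dataclass
class BlockHeader:
    version: int             # 4 bytes
    prev_block_hash: bytes   # 32 bytes: hash of the previous block's header
    merkle_root: bytes       # 32 bytes: summary of this block's transactions
    timestamp: int           # 4 bytes: Unix time
    bits: int                # 4 bytes: compact encoding of the difficulty target
    nonce: int               # 4 bytes: the value miners vary

    def serialize(self) -> bytes:
        # "<" = little-endian, no padding; I = unsigned 32-bit; 32s = 32 raw bytes
        return struct.pack(
            "<I32s32sIII",
            self.version,
            self.prev_block_hash,
            self.merkle_root,
            self.timestamp,
            self.bits,
            self.nonce,
        )

    def hash(self) -> bytes:
        """Double SHA-256 of the serialized header."""
        return hashlib.sha256(hashlib.sha256(self.serialize()).digest()).digest()


# Placeholder values for illustration only.
header = BlockHeader(
    version=1,
    prev_block_hash=bytes(32),
    merkle_root=bytes(32),
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=0,
)
print(len(header.serialize()))  # 80
print(header.hash().hex())
```

Only the header is hashed during mining; the transactions influence the result through the Merkle root.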

The nonce is a “number only used once”; in a Bitcoin block it is a 32-bit (4-byte) field. The value of the nonce is adjusted by miners so that the hash of the block will be less than or equal to the current target of the network. As this iterative calculation requires time and resources, the presentation of the block with the correct nonce value constitutes what’s known as proof-of-work.
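
The search for a nonce can be sketched as a simple loop. This is a simplified illustration under several assumptions: it uses a single SHA-256 pass, a made-up header string, a big-endian comparison, and a toy target that is vastly easier than anything on the real network. The `mine` function and `TOY_TARGET` are hypothetical names.

```python
import hashlib


def mine(header_data: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces in order until the hash, read as a number, is <= target."""
    for nonce in range(max_nonce):
        candidate = header_data + nonce.to_bytes(4, "little")  # 4-byte nonce
        digest = hashlib.sha256(candidate).digest()
        if int.from_bytes(digest, "big") <= target:
            return nonce, digest.hex()
    return None  # no nonce in the 32-bit range met the target


# Toy target: the top 12 bits of the hash must be zero (about 1 in 4,096 hashes).
TOY_TARGET = 2 ** (256 - 12) - 1

result = mine(b"example block header", TOY_TARGET)
if result is not None:
    nonce, digest_hex = result
    print(nonce, digest_hex)
```

Lowering the target makes valid hashes rarer, which is how the network raises the difficulty.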

The *Genesis Block* is the first block in a blockchain, and the only block with no predecessor. A crucial function of the blockchain is that it relies on hash pointers, which contain the address of the previous block as well as the hash of that block’s data. The blockchain is similar to a sequence of chain links, where each link is connected to its antecedent link. Blocks are connected to the previous block using such a *pointer*. A hash pointer both “points” to where the previous block is stored and records the hash of its contents, so any change to the previous block can be detected. Data from the previous block is hashed (not encrypted, since hashing cannot be reversed) into a series of letters and numbers of a fixed length.

Continuing, the search for a valid nonce has *high min-entropy*: candidate values are drawn from a very large space and the hash of each candidate is unpredictable. High min-entropy means that there is a low likelihood of hitting the target hash by chance. A miner is only successful when they find a nonce for which the block’s hash meets the target hash, at which point that nonce is recorded in the block header.

In order for new blocks to be accepted to an existing blockchain, a *proof-of-work* must be generated. The proof-of-work is a block hash, a fixed-length string of letters and numbers, that is less than or equal to the target. In Bitcoin, this hash is computed with the double SHA-256 hashing algorithm, meaning SHA-256 is applied twice.

That means that once a hash meeting the target has been obtained, the block is accepted into the public ledger by the consensus of other participating networks.
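
Double SHA-256 is easy to express with Python's standard `hashlib` module. A minimal sketch follows; the function name `double_sha256` is my own.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256, then apply SHA-256 again to the 32-byte result."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


digest = double_sha256(b"hello")
print(digest.hex())      # 64 hexadecimal characters
print(len(digest) * 8)   # 256
```

The second pass hashes the raw 32-byte digest of the first pass, not its hexadecimal text form.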

Because the blockchain only contains those transactions that have been validated, this prevents fraudulent transactions from being added to the chain, and it prevents double-spending, where the same coins are spent twice.

This is why cryptocurrencies like Bitcoin depend on the features of blockchain technology: using a sequence of interdependent block hashes replaces the need for a trusted third party. With blockchain, the publicly published list of transactions functions as the guarantor, creating a public system of participation that is itself the guarantee of its own authenticity. Or, more correctly, it is continually guaranteed by the individual networks that maintain the established blockchain.

So in this sense, a blockchain like Bitcoin’s is self-guaranteeing, because there are multiple networks continuously approving the transactions. The currency can then be used successfully in a transaction and/or exchange of goods. The Bitcoin blockchain is thus a public ledger that is composed of blocks which have been successfully hashed and thus added to the list of transactions that have been mutually approved by independent networks.

**Hash and Hashing**

A hashing algorithm is a computational function that condenses input data into a fixed size, the result of which is the output called a *hash* or a *hash value*. Hashes are used to identify, compare or run calculations against files and strings of data. Typically, a program first computes the hash of the data it has received and then compares that value to a previously recorded hash of the original.

If you didn’t love doing math in school, that is ok, because while hashing relies on some pretty crazy Alan Turing-*esque* computations, a computer program does all the math for you. So all you need to remember from math class are the basics of exponents and probability.

A basic example of hashing is digitally signing a piece of software so that it can be safely downloaded. To do this, the publisher computes a hash of the program file and then signs that hash with a private key, producing a digital signature that is published along with the software. When someone downloads the software, their browser or operating system runs the same hash function, using the same algorithm, over the downloaded file, and uses the publisher’s public key to check the signature against the resulting hash. If the two hash values match, it can confirm that both the signature and the file are authentic and that they have not been altered.
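
The hash-comparison half of that check can be sketched with the standard library alone. This covers only the integrity check against a published hash, not the signature itself, and the function names and the idea of a "published" hexadecimal string are assumptions for illustration.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Return True if the file's SHA-256 equals the hash the publisher listed."""
    return hmac.compare_digest(sha256_of_file(path), published_hex.strip().lower())
```

A caller would pass the path of the downloaded file and the hash copied from the publisher's site; changing a single byte of the file makes the comparison fail.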

Hash values are *deterministic*: the same input always produces the same output. The reason hashing is so useful for cryptocurrencies is that it is practically impossible to reproduce the same hash with a different data set as the input. The resultant hash is, for practical purposes, both unique and irreversible. For example, an input of “123” will always have the same output. If it didn’t, and it came up with a different output every time it was hashed, then there would be no consistency or validity to the process because your programs would never speak the same language. The hash used for Bitcoin is a 64-digit hexadecimal number, which I will explain shortly.
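
These properties are easy to observe directly. The following sketch uses Python's `hashlib`; the helper name `sha256_hex` and the choice of “124” as the altered input are my own.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("123")
second = sha256_hex("123")
changed = sha256_hex("124")

print(first == second)    # True: same input, same output
print(len(first))         # 64 hexadecimal digits, i.e. 256 bits
print(first == changed)   # False: a one-character change gives a different hash

# Count how many of the 64 hex digits differ between the two hashes.
differing = sum(a != b for a, b in zip(first, changed))
print(differing)
```

The last number is typically close to 60, which illustrates how thoroughly a small change in the input scrambles the output.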

**Digital Signatures**

Hashing is often used together with digital signatures. For example, SSL certificates (SSL/TLS protocol) play a role in making secure data transmission from one device to another possible. Digital signatures bind a key to a dataset. The signature on an SSL certificate lets the recipient authenticate the data it receives. SSL certificates, therefore, need to match a specific public key to the intended party, kind of like a lock and key.

SSL/TLS uses asymmetric encryption, which makes secure key exchanges possible. The security of this transfer relies on two keys: a public key used for encryption, and a private key for the recipient’s decryption. For digital signatures the roles are reversed: the private key signs and the public key verifies. Digital signatures are very sensitive: a small change to the signed data results in a very different hash, so the signature no longer verifies.

**SHA-256**

Presently SHA-256 is one of the most widely used secure hashing functions. SHA stands for *Secure Hash Algorithm*, and 256 expresses the fixed bit length of the output. This means that every hash is exactly 256 bits long and, as mentioned, Bitcoin writes it as a 64-digit hexadecimal value (each hexadecimal digit encodes 4 bits).

Using the SHA-256 function makes it (nearly) impossible to find an input for a given hash because there are just too many combinations to try and process. Therefore, a significant amount of computational work is required, so much so that bitcoins are no longer mined with personal computers and presently rely on *Application-Specific Integrated Circuits*, or ASICs. Hitting one specific hash has a probability of 1 in 2^256 per attempt; if you remember your exponents, you will deduce this is an incredibly difficult target to hit.

Furthermore, such a hash is intentionally computationally impractical to reverse, with the intended result that the only way to solve for the input is a random or *brute-force* search.

Consider the following: if I have one six-sided die, I have a 1 in 6 chance of rolling a 6. However, the more sides my die has (say 256 sides), the lower my chances of rolling a 6 get (that’s 1 in 256, which is still far better than your odds of using brute force on an existing hash).
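
To put numbers on the comparison, here is a back-of-the-envelope sketch. The guessing rate of 10^18 attempts per second is a made-up figure chosen only to show the scale, and `expected_years` is a hypothetical helper.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def expected_years(num_outcomes: int, guesses_per_second: int) -> float:
    """Average time to hit one specific outcome when every guess is independent.

    With a 1-in-N chance per guess, N guesses are needed on average.
    """
    return num_outcomes / guesses_per_second / SECONDS_PER_YEAR


print(1 / 6)          # chance per roll with a six-sided die
print(1 / 256)        # chance per roll with a 256-sided die
print(1 / 2**256)     # chance per guess against one specific 256-bit hash

# Assumed rate: 10**18 guesses per second (hypothetical).
print(expected_years(2**256, 10**18))
```

Even at that assumed rate, the expected time exceeds 10^50 years, which is why brute-forcing a specific hash is not considered practical.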

A *hash rate* is the speed at which hashing operations take place during the mining process. If the hash rate gets too high, miners meet the target hash too quickly, which indicates that the difficulty of the hash needs to be adjusted accordingly. In Bitcoin, the difficulty is recalculated every 2,016 blocks so that, at present, a block is mined about every 10 minutes.
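
The adjustment itself is simple proportional arithmetic. The sketch below follows the rule Bitcoin uses (retarget every 2,016 blocks, aim for 10 minutes per block, limit any single change to a factor of 4), but it is simplified: it works on plain integers and ignores the compact target encoding and the maximum allowed target. The names are my own.

```python
TARGET_BLOCK_TIME = 10 * 60     # seconds per block the network aims for
RETARGET_INTERVAL = 2016        # blocks between difficulty adjustments
EXPECTED_TIMESPAN = TARGET_BLOCK_TIME * RETARGET_INTERVAL  # two weeks, in seconds


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last 2,016 blocks actually took.

    Faster blocks -> smaller target -> harder puzzle, and vice versa.
    """
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN


old_target = 1_000_000
print(retarget(old_target, EXPECTED_TIMESPAN))       # 1000000: on schedule, no change
print(retarget(old_target, EXPECTED_TIMESPAN // 2))  # 500000: blocks came twice as fast
print(retarget(old_target, EXPECTED_TIMESPAN * 10))  # 4000000: capped at a factor of 4
```

A smaller target means fewer hashes qualify, so a rising hash rate is answered with a harder puzzle.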

**Collision Resistance**

Due to the complexity and sensitivity of SHA-256, reversing a hash in an effort to find the original input data is basically impossible (this property is called pre-image resistance). SHA-256 is also considered extremely secure because it is “collision resistant.” Collision resistance means that the likelihood of finding two different inputs that produce the same hash is minuscule.

Therefore, given the number of possible outputs of SHA-256, the probability of a collision is negligible. Compare two different hash functions: SHA-1 produces a 160-bit hash, while SHA-256 produces a 256-bit hash (and Bitcoin applies SHA-256 twice). The longer hash has a vastly larger output space and, as a consequence, is far more collision resistant.
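
To see the difference between the two functions, the following sketch hashes the same message with SHA-1, SHA-256, and double SHA-256 using Python's `hashlib`. The sample message is arbitrary.

```python
import hashlib

message = b"blockchain"

sha1_hex = hashlib.sha1(message).hexdigest()
sha256_hex = hashlib.sha256(message).hexdigest()
double_hex = hashlib.sha256(hashlib.sha256(message).digest()).hexdigest()

for name, value in [
    ("SHA-1", sha1_hex),
    ("SHA-256", sha256_hex),
    ("double SHA-256", double_hex),
]:
    # Each hexadecimal digit encodes 4 bits.
    print(f"{name:>15}: {len(value) * 4} bits  {value}")
```

Note that applying SHA-256 twice does not lengthen the output: both the single and the double version are 256 bits. The gain in collision resistance over SHA-1 comes from the longer output and the stronger design.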

Here are a few examples of cryptographic hash functions and, where it applies, how many hash computations it took to break their collision resistance. It will become evident why SHA-256 is currently the favored hash:

- MD5: Produces a 128-bit hash.
  - Collision resistance was broken after ~2^21 hashes.
- SHA-1: Produces a 160-bit hash.
  - Collision resistance was broken after ~2^61 hashes.
- SHA-256: Produces a 256-bit hash.
  - This is currently being used by Bitcoin.
- Keccak-256: Produces a 256-bit hash.
  - Currently used by Ethereum.

**Merkle Tree and Merkle Roots**

As blocks continue to be added to a growing blockchain, there becomes a need to reclaim storage space; this is the role of the Merkle Tree. Rather than storing every transaction in full, a node can keep only the root hash (the Merkle Root), and it is still possible to verify the blockchain without sorting through all of the data. Verification is therefore simplified by following a branch that links a transaction to the block it was time-stamped in, without needing the complete set of transactions (as these take up a lot of space). Instead, the check is to ensure that the network accepted the transaction into that block. This continually affirms its reliability, so subsequent blocks are added to the chain and the value can be traded on.

How is this done? The Merkle Root summarizes all of the data in the related transactions and is stored in the block header. Just as we saw is the case for hashes, if a single detail in any of the transactions is altered, so is the Merkle Root. Using a Merkle tree makes testing to see whether a specific transaction is included in the set much more efficient than going through all of the transactions in the block.

Merkle trees are created by repeatedly hashing pairs of nodes until there is only one hash left (this hash is called the Root Hash, or the Merkle Root). Each leaf node is a hash of transactional data, and each non-leaf node is a hash of its two child hashes. Merkle trees are binary and therefore require an even number of nodes at each level. If the number is odd, the last hash will be duplicated once to create an even number of nodes.
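
The construction described above can be sketched as follows. This is an illustration rather than Bitcoin's exact implementation: it uses double SHA-256 for every node and duplicates the last hash whenever a level has an odd count, but it skips details such as byte ordering. The function names are my own.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, used here for both leaves and inner nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Hash pairs of nodes level by level until a single root hash remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256d(tx) for tx in transactions]  # leaf nodes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash to make pairs
        level = [
            sha256d(level[i] + level[i + 1])  # parent = hash of its two children
            for i in range(0, len(level), 2)
        ]
    return level[0]


root = merkle_root([b"tx1", b"tx2", b"tx3"])
print(root.hex())
```

Changing any one of the transactions produces a different root, which is what lets the block header commit to the whole set with a single 32-byte value.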

I will borrow an example from Shaan Ray. Here are four transactions in a block: A, B, C, and D. Each of these is hashed, and the hash stored in each leaf node, resulting in Hash A, B, C, and D. Consecutive pairs of leaf nodes are then summarized in a parent node by hashing Hash A and Hash B, resulting in Hash AB, and separately hashing Hash C and Hash D, resulting in Hash CD. The two hashes (Hash AB and Hash CD) are then hashed again to produce the Root Hash (the Merkle Root).
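
The four-transaction example can be reproduced in code, along with the inclusion check that makes Merkle trees useful. In this sketch the transactions are stand-in byte strings, a single SHA-256 pass is used for brevity, and `verify_inclusion` and the proof format are assumptions of mine rather than a standard API.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Single SHA-256, used here for brevity."""
    return hashlib.sha256(data).digest()


# Leaf nodes: one hash per transaction.
hash_a, hash_b, hash_c, hash_d = (h(tx) for tx in (b"A", b"B", b"C", b"D"))

# Parent nodes and the root.
hash_ab = h(hash_a + hash_b)
hash_cd = h(hash_c + hash_d)
root = h(hash_ab + hash_cd)


def verify_inclusion(tx: bytes, proof: list[tuple[bytes, str]], expected_root: bytes) -> bool:
    """Rebuild the path from a transaction up to the root using sibling hashes.

    Each proof step is (sibling_hash, side), where side says whether the
    sibling sits to the "left" or "right" of the current node.
    """
    current = h(tx)
    for sibling, side in proof:
        current = h(sibling + current) if side == "left" else h(current + sibling)
    return current == expected_root


# To prove C is in the block, only Hash D and Hash AB are needed.
proof_for_c = [(hash_d, "right"), (hash_ab, "left")]
print(verify_inclusion(b"C", proof_for_c, root))  # True
print(verify_inclusion(b"X", proof_for_c, root))  # False
```

The proof contains two hashes for four transactions; in general it grows with the logarithm of the number of transactions, which is where the efficiency comes from.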

**Conclusion and Review**

For some, cryptocurrencies may seem too ephemeral to trust, but the basic idea of currency like Bitcoin relies on typical monetary practices of a fiat system. In fact, currently, similar monetary systems are used frequently, as many transactions and bank balances rely on data rather than the physical presence of hard currency (like gold). A crucial difference in the application of blockchain in terms of cryptocurrencies is that typically an exchange of currency requires a third party as guarantor; like a bank or credit card company. However, the application of blockchain technology in cryptocurrencies is disrupting the need for a third party, as well as making non-cash peer-to-peer transactions more secure and desirable.

Here is what you should take away from this article, *What is Blockchain Hashing?*:

#### Blockchain technology

- Blockchain as a public ledger: by relying on individual networks, transactions are hashed and added to a public record. This record is maintained by the approval of the participating networks. The blockchain ledger is essential for maintaining the currency’s validity and makes the reliability of decentralized cryptocurrencies possible.
- Adding blocks to the blockchain: To add to the blockchain, miners search for a hash that meets the target hash. This relies on data from the block header, which contains a version number, the hash of the previous block, the Merkle Root, a timestamp, the difficulty target, and the nonce.
- Proof-of-work: Once a hash meeting the target has been found, it serves as the proof-of-work; how hard it is to find depends on SHA-256 and the current level of difficulty.
- Nonce: The nonce is a “number only used once”; in a Bitcoin block it is a 32-bit (4-byte) field. The value of the nonce is adjusted by miners so that the hash of the block will be less than or equal to the current target of the network.

#### Hashing

- Hash algorithms are computational functions. The process condenses input data into a fixed size; the result is an output called a hash or a hash value. Hashes are used to identify, compare or run calculations against files and strings of data. When attempting to add to an existing blockchain, a miner must find a hash that meets the target hash in order for the new block to be accepted.
- Hashes are deterministic and pre-image resistant:
  - Deterministic: a particular input will always produce the same hash. This makes it possible to keep track of transactions.
  - Pre-image resistant: it is nearly impossible to recreate the input from the output data.
- SHA-256:
  - This produces a 256-bit hash. The output space is so large that there are too many possible outcomes to compare hashes against and attempt to solve backward. If one wanted to find the input for a given hash, they would need to begin with a random input, hash it, and test the result against that hash; this would take a nearly incalculable number of attempts.
  - Collision Resistance: SHA-256 is collision resistant because of the size of its output space, so finding two different inputs that produce the same hash is nearly impossible.
- Merkle Trees and Merkle Roots:
  - The Merkle Root summarizes all of the data in the related transactions and is stored in the block header. Just as we saw with hashes, if a single detail in any of the transactions is altered, so is the Merkle Root. Using a Merkle Tree makes testing to see whether a specific transaction is included in the set efficient, rather than going through all of the transactions in the block.

Sources:

http://kddlab.zjgsu.edu.cn:7200/research/blockchain/hehonghao-reference/34-Blockchain%20Technology_%20Principles%20and%20Applications.pdf

https://www.thesslstore.com/blog/difference-sha-1-sha-2-sha-256-hash-algorithms/

https://bitcoin.org/bitcoin.pdf

https://www.techopedia.com/definition/3470/concatenation-programming

https://hackernoon.com/merkle-trees-181cb4bc30b4