SHA256 and Bitcoin Mining Walkthrough

Published in Bitcoin Magazine

Translations: French šŸ‡«šŸ‡· German šŸ‡©šŸ‡Ŗ Slovenian šŸ‡øšŸ‡® Russian šŸ‡·šŸ‡ŗ

How mining works is fascinating. When I explain it to people, I enjoy seeing their face the moment their mind is blown. Iā€™ll explain it here, but just know, Iā€™m imagining all your faces as your minds blow!

I have to start with hash functions. Without hash functions, Bitcoin would not be possible. Let me explain what they are first, not only so you can sound cool at parties, but also because itā€™s fundamental to understanding how Bitcoin works, particularly mining, but also transactions under the hood (If interested to learn how to perform a SHA256 hash with pen and paper, see this guide).

You donā€™t need to understand how Bitcoin works in order to benefit from it, just like how you donā€™t need to understand how TCP/IP works to use the internet. But do go on, because itā€™s quite interesting, and Iā€™ll make it easy to understand, promise. 

HASH FUNCTIONS 

Letā€™s start with a schematic which Iā€™ll explain below… 

(Image by @jirols_btc)

On the left is the INPUT, the centre is the FUNCTION, and on the right is the OUTPUT. The input can be any data, as long as itā€™s digital. It can be of any size, provided your computer can handle it. The data is passed to the SHA256 program. The program takes the data and from it, calculates a random-looking number, but with special properties (discussed later).

The first SHA (Secure Hash Algorithm) was originally developed by the NSA and there are many different versions now (Bitcoin uses SHA256). Itā€™s a set of instructions on how to jumble up the data in a very complicated but specified way. The instructions are not a secret, and itā€™s even possible to do it by hand, but it is very tedious.

For SHA256, the output is a 256-bit number (not a coincidence). 

A 256-bit number means a binary number 256 digits long. Binary means the value is represented with two symbols, either 0 or 1. Binary numbers can be converted to any other format, for example decimal numbers, which are what we are familiar with. 

Although the function returns a 256-digit binary number, the value is usually expressed in hexadecimal format, 64 digits long.

Hexadecimal means that instead of 10 possible symbols like we are used to with decimal (0 to 9), we have 16 symbols (The ten we are used to, 0-9, plus the letters a, b, c, d, e, and f; which have the values 10 to 15). As an example, to represent the value of decimal 15 in hexadecimal, we just write ā€œfā€ and itā€™s the same value. Thereā€™s plenty of information available online with a quick Google search if you need more elaboration.  

To demonstrate SHA256 in action, I can take the number 1, and run it through an online hash calculator, and I get this output (in hexadecimal): 

The input window is the one above, the output window is the one below.

Note that all computers in the world will produce the same output, provided the input is the same and the SHA256 function is used.

The hexadecimal number output, if converted to decimal, is (notice it takes more digits to write): 

48,635,463,943,209,834,798,109,814,161,294,753,926,839,975,257,569,795,305,637,098,542,720,658,922,315

And converted to binary it is: 

11010111000011010110010011100111111111100110100111111001110000110011101011010111000000001001110111111110101101000111111010101110100011110101101101001001110101010100010001011110001110101001001110000000001111001010010110111011011011110000111010110110100101111010111001101011100110101110011010111001101011100110101110011010111001101011100111 

Just out of interest, here is the same value in base 64 (Iā€™ll let you Google what base 64 means): 

1w1k5/5p+cM61wCd/rR+ro9bSdVEXjqTgDylu28OtpY= 

Note that the smallest possible value SHA256 could return is zero, but the LENGTH is still 256 bits. This is how zero is represented:

0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

And the largest possible value is:

1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

In decimal, thatā€™s:

115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,935

In hexadecimal, it is:

FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

Note there are exactly 64 Fā€™s.

Zero in hexadecimal can simply be written as one single zero, but for hash output, itā€™s 64 of them to keep to the requirement of a fixed size output:

0000000000000000000000000000000000000000000000000000000000000000

Here is a summary of some facts about the hash function that is vital to appreciate:

  • The input cannot be determined from the output 
  • The input can be any length 
  • The output is always the same length 
  • The output will always be reproduced identically if you provide the same input. 
  • Any change to the input, no matter how small, will cause an unpredictable, and wildly different output
  • The output is ā€œseeminglyā€ random, but is actually deterministic (meaning it is calculated)
  • The output cannot be predicted. It can only be calculated, and this takes a measurable amount of work by a computer (and hours with pencil and paper! Donā€™t do it.)

Now that you understand the basic concept of what a hash is, you can understand the explanation of how Bitcoin mining works. 

But before you move on, I recommend you go to an online hash calculator and play with it a little and test for yourself what Iā€™ve said about Hash functions. I like this one.

Mining

I will start by demonstrating a concept of work, which is where ā€œproof-of-workā€ in Bitcoin comes from.

Go to the online hash calculator and type ā€œI am creating 50 bitcoins and paying myself this amount.ā€

Type it exactly, case sensitive, including the full stop. You should get this output:

Now, letā€™s create a rule that says for this payment message to be valid, we need the hash to start with one zero. To do that, we have to change the input somehow. But, as youā€™ve learned, itā€™s not predictable what the output would be for a given input. What modification can we make to ensure a hash starting with zero? 

We have to add data using trial-and-error. But we also donā€™t want to change the meaning of the input message. So, letā€™s create a field (an allocated section) called a ā€œnonceā€ which will hold a nonsense value.

The word, ā€œNonceā€, is supposed to be derived from ā€œnumber only used onceā€, but I don’t see it. 

Notice below how adding this extra field heading, “Nonce:”, changes the hash output.

The output still doesnā€™t start with a ā€˜0ā€™, so letā€™s add some nonsense (I added a meaningless ā€˜xā€™):

It still doesnā€™t start with a zero. I tried some more characters until the hash started with a zero:

There we go. Now, according to the arbitrary rules I set, the text in the input window is a valid imaginary-Bitcoin block with a single transaction paying myself 50 bitcoins. 

Note that Bitcoin blocks are essential ā€œpagesā€ of a ledger. Each block is numbered and creates new bitcoins, along with transactions between users. This is the record, and where bitcoins live.

Now a new rule. For the next block, the hash of the previous block must be included. Iā€™ll add a little complexity and add a few more fields to approach what the real Bitcoin block hasā€¦

The hash starts with an ā€˜fā€™ not ā€˜0ā€™, so Iā€™ll have to try some values in the nonce field:

This time I was luckier and found a nonce after only 4 different tries. For the first block, I found one after 22 tries – there is some randomness here, but generally, itā€™s not too difficult to find a valid hash if all we’re trying to get is one zero. There are 16 possible values for the first hash digit, so I have a 1 in 16 chance that any modification I make to the input field will result in the first hash digit being ā€˜0ā€™.

Note that Bitcoinā€™s fields are like this, but thereā€™s more detail that I havenā€™t added, just to illustrate a point, not necessarily to detail exactly what a Bitcoin block looks like. 

I will add a time field to the next block as I need that to explain the ā€œdifficulty adjustmentā€ next:

Above is block number 3. It includes the previous blockā€™s hash and now Iā€™ve also started to include the time. The nonce I found successfully made the hash start with a zero (I just kept typing a ā€˜1ā€™ until the hash target was met).

Thereā€™s enough here now that I can start explaining a few interesting concepts about the Bitcoin blockchain and mining.

Winning a block

The mining process is competitive. Whoever produces a valid block first gets to pay themselves within that block, as the block is now valid. A miner that produces the same block number a bit later (runner up) gets nothing – that block is rejected. Explaining why that is causes too much of a diversion now, so Iā€™ll explain it in the appendix.

After block 3 is found and broadcasted to everybody (all the Bitcoin nodes), all the miners stop working on block 3 and start working on block 4. The winner publishes the result, and then everyone starts working on block 5, etc. 

With each block, new bitcoins are being created and collectively make up the total supply so far. If there are many miners, then statistically, blocks will be produced faster, and bitcoins will be created faster. Problem, right? 

Seeking a limited supply of bitcoins with a predictable issuance over time, Satoshi Nakamoto thought of this problem and introduced a negative feedback loop to keep block production at 10-minute intervals on average. How? See if you can think of a way. Pause for a moment and ponder, and see if you can come up with the same genius solution, and read on when you give up.

NODES: I mention ā€œvalidā€ blocks. So what? Whoā€™s checking? The Bitcoin nodes are. A Bitcoin node keeps a copy of the blockchain so far and follows a set of rules to check that new blocks are within the rules, and reject those that arenā€™t. Where are the rules? In the code. A computer that downloads the Bitcoin code is a node.

The difficulty adjustment:

The average time of new Bitcoin blocks is calculated by every node every 2016 blocks (this is why the time field is needed). This is part of the protocol and rules that the nodes follow. A formula is applied to adjust the number of zeroā€™s required for each block hash 

Strictly, itā€™s not the number of zeros that is adjusted but a target value the hash has to be below, but thinking of leading zeros is simpler to explain.

If blocks are being produced too fast, then the hash target is adjusted according to pre-defined rules that all nodes follow identically (it’s in their code). 

Keeping it simple for my example, let’s say other people are competing with me, blocks are happening too quickly, and now the 4th block needs two zeros instead of one, according to an imaginary calculation. 

Itā€™s going to take me a bit longer to get two zeros, but weā€™re imagining that there are many other people competing with me so the total time taken for anyone to find a block is kept to a target.

Here is the next block:

Notice the time. More than 10 minutes passed since the previous block (I just made the time up to demonstrate). The 10-minute target is statistical, it is never known exactly when the next block will be found.

I messed around on the keyboard for a minute until two zeros showed up. This was exponentially harder than finding a single zero. The chance of finding two 0ā€™s in a row is 1 in 16×16, or 1 in 256 chance.

If more people join to mine and compete for new bitcoins, then eventually 3 zeros will be required.

I just looked up the last real Bitcoin block, which contains the hash of the previous block. The hash was:

000000000000000000084d31772619ee08e21b232f755a506bc5d09f3f1a43a1

Thatā€™s 19 zeros! Thereā€™s a 1 in 1619 chance of finding such a block with each attempt. Bitcoin miners do many many attempts per second, and collectively all over the world.

The number of attempts per second is known as the ā€œhash rateā€. Currently, the estimated world hash rate is just under 200 Tera hashes per second (200 trillion per second). With that many attempts per second, a block with a hash starting with 19 zeros is found around every 10 minutes. 

In the future, as more miners join in, the hash rate will go up, blocks will be found faster, and Bitcoin’s difficulty will adjust to require 20 zeros etc, which keeps block production around 10 minutes.

The halving:

When Bitcoin first started, 50 bitcoins were produced with every block. The rules of the Bitcoin blockchain specify that after every 210,000 blocks, the reward will be cut in half. This moment is known as ā€œthe halvingā€, and happens roughly every 4 years. The halving, combined with the difficulty adjustment keeping blocks at 10-minute intervals means that around the year 2140, the block reward will be 0.00000001, or 1 Satoshi, the smallest unit of a bitcoin, and canā€™t be halved anymore. Mining wonā€™t stop, but the ā€œblock rewardā€ will be zero. From this moment, no new bitcoins will be created going forward, and the number of bitcoins is mathematically calculatable and close enough to 21 million coins. This is how the total supply is known – it is programmatically set.

The miners will still be rewarded though, not from the ā€œblock rewardā€, but from transaction fees – explained next.

How exactly is the block reward cut in half? It’s in the code held by the nodes. They know to reject any new block after 210,000 where a miner pays himself over 25 bitcoins.

Transaction Fees:

So far Iā€™ve only shown imaginary blocks with a single transaction, the transaction where the miner gets paid a reward. This is called the ā€œcoinbase transactionā€. 

Itā€™s not named after the company, Conbase, I mean Coinbase. The company named itself after the coinbase transaction, not the other way around, donā€™t get confused.

In addition to the coinbase transaction, there are transactions of people paying each other. Hereā€™s an imagined example:

I didnā€™t bother finding a real hash this time (Itā€™s actually the real hash reported in block 200,001). The Nonce I just made up for fun, but notice a message can be embedded there.

Satoshis famously included the words, ā€œChancellor on Brink of Second Bailout for Banksā€ in the first Bitcoin block (The Genesis Block), after the newspaper headline for the day.

The point here is that there are 132 transactions included (not all shown). Look at transaction #132 –  2.3 bitcoins from an address is paying 2.1 bitcoins to another address and also to a second address the amount 0.1 bitcoin (I’ve used dots to shorten the length of the address).

So a source of 2.3 bitcoins pays a total of 2.2 bitcoin (2.2+0.1=2.2). Is there 0.1 bitcoin missing? No, the difference is claimed by the miner, as I’ll explain.

The miner is allowed to pay himself 25 bitcoins as the block reward (because 210,000 blocks have passed so the reward has been halved from 50 to 25). But if you look, the coinbase transaction is 27.33880022. The extra 2.33880022 bitcoins come from other 132 transactions in the block – the inputs will all be slightly greater than the total of the outputs. So the miner gets to create these ā€œabandonedā€ bitcoin as payment to himself.

The block space is limited. When Bitcoin was new, users could send transactions with no fee, and the miners would include the transaction in the block. But now, there are more users, and since getting on the next block is competitive, users include a fee in the transaction to entice the miner to choose their transaction over othersā€™.

So when the block reward steadily goes down, halving every 4 years and eventually to zero, miners still get paid in this way.

Some have suggested that one day the reward to miners will not be enough and will cause Bitcoin to fail. This concern has been thoroughly debunked and I won’t repeat it here.

Can a block be rewritten?

This is extremely unlikely and itā€™s worth understanding why. Youā€™ll then appreciate why Bitcoin transactions are immutable (unchangeable).

I explained earlier that the hash of the previous block is included in the current block. That means any editing of transactions in an old block changes the hash of the block. But that hash is recorded down in the next block, so the next block needs to be updated. But if you change the hash recorded in the next block, then its hash needs to change too.

Note that any time a hash is changed, you lose all these lovely zeros and will just be left with a random-looking hash – and have to do all the work again to get the zeros back. If you do that for the block you tried to edit, you then have to redo the work for the next block, and the next all the way to the most recent block. You canā€™t simply stop at the old block, because the rules of Bitcoin are such that the longest chain of blocks is the real Bitcoin record. If you go back and edit a block 10 blocks ago, you no longer have the longest chain. You have to add 10 more blocks and then a bit more because as you were creating those 10 blocks, the real chain probably became a bit longer. You have to race to overtake the real chain. If successful, then the new version becomes the real version.

Repeating the entire worldā€™s collective hashing effort from the edited block to the latest block is the barrier to editing Bitcoin. The energy was expended to create those hashes with all those improbable zeros, and that energy expenditure must be repeated to edit Bitcoin. This is why energy used to mine Bitcoin is not ā€œwastedā€; it is there to defend Bitcoin from edits; to make the ledger immutable without needing to trust a central authority.

What happens if 2 miners find a block at the same time?

This actually happens every now and then, and it always sorts itself out as follows:

Every node will receive either one of the new nearly-simultaneous blocks first and will accept that one, and reject the one arriving later. This results in a split of the network, but itā€™s temporary.

To illustrate, let’s call one of the blocks blue and the other red (they have no colour, just bear with me)

Miners then work on the next block, but there will be a split as to which block they extend the chain from.

Letā€™s say the winning miner found a block using the blue chain. They will send the new block to all the nodes and the longest chain will be apparent. The nodes that had accepted the red chain will then drop it, and adopt the blue chain. 

All miners that were working on the red chain will stop, and will now work on the longer chain which is the blue chain. The red chain died.

Appendix

Why a runner-up minerā€™s block is invalid

Suppose block 700,000 just got mined by MINER-A. 30 seconds later, MINER-B also created a different version of block 700,000. When MINER-B broadcasts this alternative, every node is going to reject it because they have already seen and accepted the block by MINER-A. Whatā€™s more, in those 30 seconds, letā€™s say that MINER-C found block 700,001. Given that MINER-Bā€™s competing 700,000th block does not extend the current chain (which is up to 700,001), it is also rejected for that reason.

Even more interesting is that if MINER-B had been working on Block 700,001 instead of a competing version of 700,000, they would have had just as much chance of mining a valid block than the invalid one. So as soon as any miner sees a new block, they should set their effort on the next block.

If however, Miner-B found block 700,000 1 second after MINER-A did, then itā€™s possible that some nodes see MINER-Aā€™s block first, and some see MINER-Bā€™s block, depending on geographic locations and internet speeds. In that case, there is a temporary fork, and some miners will be working to extend one version, and some miners will be working to extend the other. As explained earlier using the ā€œblue chainā€ and ā€œred chainā€ descriptors, eventually one of the versions will extend further before the other and become the valid version unanimously. 

Tips:

Static Lightning Address: dandysack84@walletofsatoshi.com


On-chain or Lightning