0 - Blockchain: The Basics

Go from not knowing how blockchains actually work to understanding the fundamentals that make blockchain so powerful.

Photo by Shubham Dhage / Unsplash

In this blog, you will go from knowing nothing about blockchains to understanding the fundamentals of how they work. Keep in mind that this blog is not for someone who wants to learn how to use blockchains; it is for people who want to learn how blockchains work.

Some linked websites are NOT designed for mobile. Using a device with a bigger screen is recommended, but not required.

What are Blockchains?

Blockchains are decentralized networks of computers that keep a public ledger of transactions. Ok, what does that even mean? Well, the ledger is just a big list of every transaction (e.g., buying a subscription). But this ledger by itself is not very safe. Imagine someone hacking the ledger to give themselves $1,000,000. To prevent this, the ledger of transactions is verified by a decentralized network of computers. If someone wanted to hack our blockchain, they would need to take over more than 50% of the entire network.

I have simplified this, and we will go more in-depth on everything later, so don't worry if you don't understand everything yet. The first thing we will understand is the ledger.

🧐
Keep in mind that many people use the word blockchain to describe both the decentralized nature and the ledger, but this is technically incorrect. The single word "blockchain" has simply become more popular than "decentralized blockchain" because it is simpler. "Blockchain" just refers to the ledger. You should try to use "decentralized blockchain" when applicable, but many people use these different meanings interchangeably.

How do Blockchains (The Ledger) Work?

In this section, the word "blockchain" just refers to the ledger. I also hope to give you a better understanding of what a blockchain actually is. This part might take some time to understand, so don't feel bad about re-reading it.

This section will be split into many smaller subsections for understanding. It is important to understand the previous subsection before moving on.

Hashes

Blockchains could not exist without hashes (no, not hash browns). A hashing function uses cryptography to convert an input into an output that looks completely random. Most hashing functions have fixed-length outputs, meaning that no matter the input size, the output size stays the same. For example, the word "hello" makes a 64-character-long hash, and even if you hashed an entire dictionary, it would still make a 64-character-long hash. Play around with the demo below to gain a better understanding.

Blockchain Demo
A live blockchain demo in a browser.

You will notice that when you type anything, it generates an output that seems completely random. This is mostly correct, but an important property of hashing functions is that they always produce the exact same output for the same input. Hashes are random BUT consistent. Try typing the sentence below and you will see the same hash.

Hash for "This is some random data!"
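If you would rather see these two properties in code than in the demo, here is a minimal Python sketch using SHA-256 (the hashing function the demo uses) from the standard hashlib module. The helper name sha256_hex is my own, and I am assuming the text is encoded as UTF-8 before hashing, so the exact hashes may or may not match what the demo shows.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a hexadecimal string."""
    # Assumption: text is encoded as UTF-8 before hashing.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

short_hash = sha256_hex("hello")
long_hash = sha256_hex("hello " * 100_000)  # a much bigger input

# Fixed length: both hashes are 64 characters long.
print(len(short_hash), len(long_hash))

# Consistent: the same input always gives the same hash.
same_again = sha256_hex("This is some random data!") == sha256_hex("This is some random data!")
print(same_again)

# "Random": a tiny change in the input gives a completely different hash.
tiny_change = sha256_hex("hello") != sha256_hex("hello!")
print(tiny_change)
```

The important takeaway is that the length never changes and the same input never surprises you, even though the output looks like noise.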

You may have noticed that we are using a specific hashing function called SHA256. This is one of the most popular hashing functions and is used in Bitcoin. SHA256 always produces a 256-bit output, which is written as 64 hexadecimal characters. There are way too many hashing functions to list, but here's an extensive list from Wikipedia. The other very popular blockchain (as in decentralized and ledger), Ethereum, uses the Keccak256 hashing function. It is very important to use the same hashing function across the entire blockchain because the same data will have different outputs with different hashing functions. For blockchains to work, the same input must always produce the same output, so it would be bad to use 2 different hashing functions throughout the blockchain.
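Here is a small sketch of why mixing hashing functions would break things. It hashes the same data with two different functions from Python's hashlib. One caveat: hashlib does not ship Ethereum's Keccak256, so I am using hashlib.sha3_256 (the standardized SHA-3, which differs slightly from Keccak256) purely as a stand-in for "a different hashing function".

```python
import hashlib

data = b"This is some random data!"

# Same data, two different hashing functions.
sha256_digest = hashlib.sha256(data).hexdigest()
# Stand-in only: this is standardized SHA-3, NOT Ethereum's Keccak256.
sha3_digest = hashlib.sha3_256(data).hexdigest()

print(sha256_digest)
print(sha3_digest)

# Both are 64 hexadecimal characters, but they do not match.
print(len(sha256_digest), len(sha3_digest), sha256_digest == sha3_digest)
```

If half of the network used one function and half used the other, the two halves would never agree on the hash of the same data.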

Blocks

A block is really just a piece of data in the blockchain. Other than the data, it just stores some metadata to make it work in the blockchain, nothing else. In the demo, this metadata is a block number and a nonce, and there is also a mine button (all of which we are about to talk about). Play around with the demo below.

Blockchain Demo
A live blockchain demo in a browser.

The first difference you will notice is the block number. In our blockchain, every block has a number, also known as the height of the block. The block number is always 1 more than the previous block's number and is used to determine the order of blocks. The first block is block 0, the second is 1, and so on. Though the block number determines the order, it cannot be used to uniquely identify the block. This is due to an advanced topic called forking, where 2 blocks are created with the same block number. Forking is quite interesting because it can happen on purpose for a multitude of reasons, but forking will be covered later in this blog series.
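To make the idea of a block a bit more concrete, here is a rough Python sketch of one. The class name Block, the field names, and the way the fields are joined with "|" before hashing are all my own assumptions for illustration; the demo and real blockchains each have their own exact format.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Block:
    number: int  # the block number (height)
    nonce: int   # the number we are free to change (explained next)
    data: str    # the actual data stored in the block

    def hash(self) -> str:
        # Assumption: the fields are joined as text with "|" and hashed with SHA-256.
        payload = f"{self.number}|{self.nonce}|{self.data}"
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

block = Block(number=1, nonce=0, data="hello")
other_nonce = Block(number=1, nonce=1, data="hello")

print(block.hash())
print(other_nonce.hash())  # same data, different nonce, completely different hash
```

Notice that the hash covers the metadata too, so changing the block number or the nonce changes the hash just as much as changing the data.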

When looking at the default hash, you will notice that it starts with 4 zeros.

But, when we change any of the data to make the hash start with something else, the background turns red.

From this, we can determine that the rule is the hash MUST start with 4 zeros. Does that mean we can only have data that creates a hash that starts with 4 zeros? No, this is where nonces come in! Nonce stands for a number used once. You can hear the pronunciation below (from Wiktionary).

Audio: How to pronounce nonce (0:01)

To find a hash that starts with 4 zeros with our data, we can guess different nonces until the hash output starts with 4 zeros. The nonce allows us to create a completely different hash without changing our delicate data. The way we do this is to start with a nonce of zero and calculate the hash. If the hash doesn't start with 4 zeros, we add 1 to the nonce and try again. Our computer will go through these steps, brute-forcing through nonces until the hash starts with 4 zeros. Because hashes are completely unpredictable (but always consistent), it could take 1 second or 1 month to find the nonce by brute-force guessing.
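The guessing loop described above can be sketched in a few lines of Python. The function names block_hash and mine are hypothetical, and so is the way the block number, nonce, and data are joined before hashing, so the nonce found here will not match the one in the demo.

```python
import hashlib

def block_hash(number: int, nonce: int, data: str) -> str:
    # Assumption: fields are joined as text with "|" and hashed with SHA-256.
    payload = f"{number}|{nonce}|{data}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def mine(number: int, data: str, prefix: str = "0000") -> int:
    """Brute-force nonces, starting at 0, until the hash starts with prefix."""
    nonce = 0
    while not block_hash(number, nonce, data).startswith(prefix):
        nonce += 1  # wrong guess, so try the next nonce
    return nonce

nonce = mine(1, "hello")
print(nonce, block_hash(1, nonce, "hello"))
```

There is nothing clever going on here. The loop has no way to predict which nonce will work, so it simply tries them one by one.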

This system of having a nonce is called proof of work (PoW). As the name implies, this means we are proving we did the work to find the correct nonce. The process of guessing the nonce is called mining. Whenever someone is mining a cryptocurrency, they are really just guessing these nonces. Remember that hashes make it nearly impossible to find what input created that hash, but it is extremely easy to check if the input actually created that output. In our block, this means it is very hard to find the nonce, but it is very easy to check if the nonce is correct.
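Here is a rough sketch of that "hard to find, easy to check" difference. Finding the nonce takes many hash calculations, but checking it takes exactly one. As before, the function names and the "|" joined format are assumptions of mine, not something taken from the demo.

```python
import hashlib

PREFIX = "0000"  # our rule: the hash must start with 4 zeros

def block_hash(number: int, nonce: int, data: str) -> str:
    # Assumption: fields are joined as text with "|" and hashed with SHA-256.
    payload = f"{number}|{nonce}|{data}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def is_valid_nonce(number: int, nonce: int, data: str) -> bool:
    """Checking a nonce costs a single hash calculation."""
    return block_hash(number, nonce, data).startswith(PREFIX)

def mine_and_count(number: int, data: str):
    """Return the winning nonce and how many hashes it took to find it."""
    nonce = 0
    attempts = 1
    while not is_valid_nonce(number, nonce, data):
        nonce += 1
        attempts += 1
    return nonce, attempts

nonce, attempts = mine_and_count(1, "hello")
print(f"finding the nonce took {attempts} hashes")
print(f"checking it takes 1 hash: {is_valid_nonce(1, nonce, 'hello')}")
```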

You may have noticed that pressing the "mine" button found a nonce that followed our 4 zeros rule pretty quickly. But, this makes the nonce useless because we don't need to work very hard to find the nonce. This is where difficulty comes in. With just 4 zeros, the difficulty is not very high, so mining the block is not that hard. But, most blockchains have a dynamic difficulty, which means that the difficulty can go higher or lower. The Bitcoin blockchain specifically adapts the difficulty so a block is mined about every 10 minutes. The higher the difficulty is, the harder it is to maliciously mine your own fake block. It is important that our difficulty adapts so malicious actors have a hard time modifying the blockchain, but it doesn't take too long for our good computers to find the nonce. With our example, we can increase our difficulty by requiring that the hash starts with 5 zeros or 6 zeros. In other blockchains, there are specific difficulty implementations where the difficulty changes can be more granular.
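To get a feel for how much harder each extra zero makes things, here is a tiny calculation. It assumes the hash behaves like a uniformly random hexadecimal string, so each character has a 1 in 16 chance of being a zero. The function name expected_attempts is my own.

```python
def expected_attempts(leading_zeros: int) -> int:
    """Average number of nonces to try before a hash starts with this many zeros."""
    # Each hex character has 16 possible values, so the chance that the first
    # n characters are all zeros is 1 / 16**n.
    return 16 ** leading_zeros

for zeros in (4, 5, 6):
    print(zeros, "zeros:", expected_attempts(zeros), "attempts on average")
# 4 zeros: 65536
# 5 zeros: 1048576
# 6 zeros: 16777216
```

Every extra zero makes mining 16 times harder, which is a pretty big jump. That is why a more granular way of adjusting difficulty is useful.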

Ok, but why do this? Isn't this just a waste of time?

No! The reason we do this is to deter malicious people from trying to hack our blockchain. If we have 100 computers finding the nonce for the most recent block, then the block will be created quickly. But, if there is 1 malicious computer that wants to change a block or make its own data, it will have to guess the nonce all on its own which will take a while. PoW makes changing or adding new data extremely difficult for malicious actors, meaning that the blockchain is more secure.

The Blockchain (the Ledger)

The part you have been waiting for, THE BLOCKCHAIN! Blockchains are just as they sound: they are a chain of blocks. These blocks are mathematically linked using our hashing function and proof of work. Here's a demo to play around with.

Blockchain Demo
A live blockchain demo in a browser.

The first thing you will notice is the "Prev" field, which stands for the previous hash. It is pretty self-explanatory: the previous hash is the hash of the block before it. You have probably noticed that the first block has a previous hash of just zeros. That is because the first block is the Genesis Block.

The Genesis Block of any blockchain is always the first block. This makes the Genesis Block very special. Because there is no previous block, the Genesis Block can have a previous hash of anything; it doesn't matter (in this case it is all zeros).

Other than these differences, the blockchain doesn't seem very special but don't be fooled. You can think of the blockchain as just a metal chain, where each link is a block of data.

If you try to swap out any of the links or add a new one, you just can't. You can only add a new link to the bottom. You also can't cut out any of the links, or else the rest of the chain falls apart.

This functionality is translated to our blockchain through our hashes and previous hashes. An important way our blockchain blocks work is that the block's hash includes the previous hash. So, if we change data in the second block, our second block's hash will change, causing the third block's previous hash to change, causing the third block's hash to change, causing the fourth block's previous hash to change, and so on. Because the current block is linked to the previous block, if the previous one changes, the current block changes. This functionality causes a difference in the second block to have a chain reaction down the entire blockchain making every following block invalid. In this example, if the chain was 20 blocks long, the malicious actor would need to find the nonce for all 19 newly invalid blocks (which would take forever).
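Here is a rough Python sketch of that chain reaction. All of the names (mine_block, build_chain, first_broken_block) are hypothetical, the blocks are plain dictionaries, and I am only requiring 2 leading zeros instead of 4 so the example runs almost instantly.

```python
import hashlib

PREFIX = "00"  # assumption: easier than the demo's 4 zeros, just to keep this fast

def block_hash(number, nonce, data, prev):
    # Assumption: the hash covers the previous hash too, joined as text with "|".
    payload = f"{number}|{nonce}|{data}|{prev}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def mine_block(number, data, prev):
    nonce = 0
    while not block_hash(number, nonce, data, prev).startswith(PREFIX):
        nonce += 1
    return {"number": number, "nonce": nonce, "data": data, "prev": prev,
            "hash": block_hash(number, nonce, data, prev)}

def build_chain(datas):
    chain, prev = [], "0" * 64  # the Genesis Block points at all zeros
    for number, data in enumerate(datas, start=1):
        chain.append(mine_block(number, data, prev))
        prev = chain[-1]["hash"]
    return chain

def first_broken_block(chain):
    """Return the number of the first invalid block, or None if all are fine."""
    prev = "0" * 64
    for block in chain:
        real_hash = block_hash(block["number"], block["nonce"], block["data"], block["prev"])
        if block["prev"] != prev or real_hash != block["hash"] or not real_hash.startswith(PREFIX):
            return block["number"]
        prev = block["hash"]
    return None

chain = build_chain(["a", "b", "c", "d"])
before = first_broken_block(chain)        # None: nothing is broken yet

chain[1]["data"] = "b (tampered)"         # change the data in block 2
after_tamper = first_broken_block(chain)  # 2: its hash no longer matches

chain[1] = mine_block(2, "b (tampered)", chain[0]["hash"])  # attacker re-mines block 2
after_remine = first_broken_block(chain)  # 3: block 3 still points at the old hash

print(before, after_tamper, after_remine)
```

Fixing block 2 just moves the problem to block 3. The attacker would have to keep re-mining all the way to the end of the chain.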

This effect does not just apply to changes, but also to deletions or additions in the middle. This is why you will hear many people call blockchains immutable, which means they can't change. But, some may wonder, "Why can't I just change the block but not change the hash? Then I won't have to brute-force finding the nonce, right?" This is great security-minded thinking and will be answered very soon in a later section of this blog. A short answer, for now, is that other computers are always checking your blockchain to see if it is invalid.

The Network (Decentralization)

Now that we know how the ledger works, we will talk about the network of computers. A blockchain by itself can still be changed if it only runs on one computer. But, if we run the blockchain app across thousands of computers, each computer will enforce the rules of the blockchain. Here's yet another demo to experiment with.

Blockchain Demo
A live blockchain demo in a browser.

So these blockchains are just the same as before, but now there is redundancy because the blockchain is copied across many different computers. If a computer wants to add a new block, it has to mine the block, then send it to the rest of the computers. When a computer receives a block, it will validate it, and only if the block is valid will it be added to the local blockchain.

Validation

Whenever a node in the network gets a block or blockchain, it will run validation to make sure that it can be used. These validation steps are pretty simple. Here's a list of them below.

💡
Node/Peer - A computer in the decentralized blockchain network.

  • Block #'s are in sequential order (1, 2, 3...)
  • A block's previous hash is the hash of the previous block
  • The hash of the block should be the calculated hash
  • The hash should follow the difficulty rule (starts with 4 zeros)

Keep in mind though that these validation steps only apply to the simple blockchain model we have discussed. When we get into transactions (in the next blog) one of the new validation steps is making sure that a user has more money than they try to spend.
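The four validation steps for our simple model could look something like this in Python. This is only a sketch under my own assumptions: blocks are plain dictionaries, the hash covers the fields joined with "|", and the names validate_chain and mine are made up for this example. The example uses 2 zeros instead of 4 so it runs quickly.

```python
import hashlib

def calculate_hash(block: dict) -> str:
    # Assumption: the hash covers number, nonce, data, and previous hash.
    payload = f"{block['number']}|{block['nonce']}|{block['data']}|{block['prev']}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def validate_chain(chain: list, prefix: str = "0000") -> list:
    """Return a list of problems; an empty list means the chain is valid."""
    problems = []
    for index, block in enumerate(chain):
        if index > 0:
            previous = chain[index - 1]
            # Step 1: block numbers are in sequential order.
            if block["number"] != previous["number"] + 1:
                problems.append(f"block {block['number']}: number is out of order")
            # Step 2: the previous hash is the hash of the previous block.
            if block["prev"] != previous["hash"]:
                problems.append(f"block {block['number']}: previous hash does not match")
        # Step 3: the stored hash is the calculated hash.
        if block["hash"] != calculate_hash(block):
            problems.append(f"block {block['number']}: hash is not the calculated hash")
        # Step 4: the hash follows the difficulty rule.
        if not block["hash"].startswith(prefix):
            problems.append(f"block {block['number']}: hash breaks the difficulty rule")
    return problems

def mine(number: int, data: str, prev: str, prefix: str) -> dict:
    block = {"number": number, "nonce": 0, "data": data, "prev": prev}
    while not calculate_hash(block).startswith(prefix):
        block["nonce"] += 1
    block["hash"] = calculate_hash(block)
    return block

first = mine(1, "first", "0" * 64, "00")
second = mine(2, "second", first["hash"], "00")
chain = [first, second]

problems_before = validate_chain(chain, prefix="00")
second["data"] = "tampered"  # change the data without re-mining
problems_after = validate_chain(chain, prefix="00")

print(problems_before)  # []
print(problems_after)   # block 2 fails step 3
```

Returning a list of problems instead of just True or False is a choice I made so you can see which rule was broken.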

Sending Blocks

When a new computer joins the blockchain, it has to sync up with the rest of the network. This new computer will request the blockchain from multiple nodes to get a local copy. It is important to ask multiple nodes to prevent 1 malicious node from sending you an incorrect version of the blockchain.

Every time a new block is created, it gets sent to the rest of the network. But, a single node cannot send the new block to a thousand computers, so what do we do? Well, the node will send it to all of the other nodes it is connected to (usually 5-10). Then those nodes will send the new block to the nodes they are connected to. In essence, the new block is being rippled across the network.
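As a very simplified illustration of why asking multiple nodes helps, here is a sketch where a new node collects answers from several peers and keeps the version most of them agree on. To be clear, this majority vote is my own simplification and the name pick_chain is made up. Real networks also validate every block themselves, and Bitcoin, for example, prefers the valid chain with the most accumulated proof of work rather than counting votes.

```python
from collections import Counter

def pick_chain(responses: list):
    """Pick the chain that the most peers sent us.

    Each response is a list of block hashes, shortened here to keep it readable.
    """
    counts = Counter(tuple(chain) for chain in responses)
    chain, votes = counts.most_common(1)[0]
    return list(chain), votes

honest = ["0000a1", "0000b2", "0000c3"]
fake = ["0000a1", "0000ff", "0000ee"]  # what one malicious node sends us

responses = [honest, honest, fake, honest]
chosen, votes = pick_chain(responses)
print(chosen, f"({votes} of {len(responses)} peers agree)")
```

If we had only asked the malicious node, we would have had no way to notice that its answer was the odd one out.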
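That rippling can be simulated with a small Python sketch. The network below is a made-up example with 5 nodes, and the function name gossip is my own. Each node only passes the block to the nodes it is directly connected to, and we count how many hops it takes for the block to reach everyone.

```python
from collections import deque

def gossip(peers: dict, origin: str) -> dict:
    """Return how many hops the new block needs to reach each node."""
    received = {origin: 0}  # the node that mined the block already has it
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in peers[node]:
            if neighbor not in received:  # ignore nodes that already have the block
                received[neighbor] = received[node] + 1
                queue.append(neighbor)
    return received

# Hypothetical network: each node lists the nodes it is connected to.
peers = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

hops = gossip(peers, "A")
print(hops)  # {'A': 0, 'B': 1, 'C': 1, 'D': 2, 'E': 3}
```

Node A only ever talked to B and C, yet the block still reached E after 3 hops. With 5-10 connections per node, a block can reach thousands of computers in just a handful of hops.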

Conclusion

And that's it! You should now know the fundamentals of how blockchains actually work. You should understand hashes, blocks, blockchains, decentralization, and validation. Look at all that new knowledge! In the next blog, we will discuss how transactions work on the blockchain.

Hope you enjoyed it!