In my last post about blockchain, I mentioned how hashes are used to link and secure a blockchain. But how does hashing actually work? I answered that briefly in the post, and here is the full explanation.
A hash function is any function that can be used to map data of arbitrary size to fixed-size values. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.
So to clarify, hashing is a technique for turning any piece of data into a seemingly random value that comes out the same every time we put the same data in. All of this is done by a mathematical algorithm, called a hashing algorithm. Now that this is out of the way, let's get to what makes a good hashing algorithm.
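To make this concrete, here is a minimal sketch using Python's standard hashlib module. SHA-256 is picked only as an example algorithm, and the helper name hash_text is made up for this illustration.

```python
import hashlib


def hash_text(text: str) -> str:
    """Return the SHA-256 hash of the text as a hexadecimal string."""
    # Hash functions work on bytes, so the text is encoded first.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_hash = hash_text("abc")
long_hash = hash_text("abc" * 10_000)

print(short_hash)
print(long_hash)
# A 3-character input and a 30,000-character input give hashes of equal length.
print(len(short_hash) == len(long_hash))
```

However much data goes in, the value that comes out has the same size.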
What makes a good hashing algorithm?
A good hashing algorithm needs to exhibit a few characteristics to be considered secure and efficient. Some of the characteristics are:
One-Way Function
A good hashing algorithm is a one-way function. This means that you can very easily derive the hash from the data, but you cannot feasibly derive the data from the hash. This is very important, because a hash exists to make sure that the data hasn't been tampered with, not to store the data. It is like a fingerprint that identifies one specific piece of data.
Deterministic
Being deterministic means that if we give the same input to the function, we will always get the same output. We need this because we must be able to verify the authenticity of the data at either end, be it the server or the client, and that only works if the hash is reproducible.
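Here is a rough sketch of how that verification could look on both ends. The message contents and the helper name sha256_hex are invented for the example, and SHA-256 is just one possible choice of algorithm.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Sender side: compute the hash and publish it alongside the data.
original = b"Pay Alice 10 coins"
published_hash = sha256_hex(original)

# Receiver side: recompute the hash of whatever arrived and compare.
received_ok = b"Pay Alice 10 coins"
received_tampered = b"Pay Alice 90 coins"

print(sha256_hex(received_ok) == published_hash)        # True
print(sha256_hex(received_tampered) == published_hash)  # False
```

Because the function is deterministic, the receiver gets the published hash back only if the data arrived unchanged.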
Fast Computation
The hashing algorithm needs to be very fast to compute, because in the fast-moving world of today we need to compute thousands if not millions of hashes in seconds. For example, in the case of Bitcoin mining, miners need to compute huge numbers of hashes every second to solve a puzzle.
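To get a feel for the speed, this small sketch counts how many SHA-256 hashes plain Python manages in a fixed time budget. The function name count_hashes and the 0.2-second budget are arbitrary choices for the example. The result depends entirely on your machine, and dedicated mining hardware is far faster than this.

```python
import hashlib
import time


def count_hashes(duration_seconds: float) -> int:
    """Count how many small SHA-256 hashes complete within the time budget."""
    count = 0
    deadline = time.perf_counter() + duration_seconds
    while time.perf_counter() < deadline:
        # Hash a different small input on every iteration.
        hashlib.sha256(str(count).encode("utf-8")).digest()
        count += 1
    return count


hashes_done = count_hashes(0.2)
print(f"{hashes_done} hashes in 0.2 seconds")
```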
The Avalanche Effect
The avalanche effect means that even a small change in the input, even a single bit, produces a completely different hash. This is important because we need the hash to be unpredictable. No one should be able to predict the hash of a value without actually computing it. This is again very important for the cryptographic puzzle in Bitcoin mining, but that is a topic for another day.
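You can watch the avalanche effect with a quick experiment. This sketch flips a single bit of the input and counts how many of the 256 output bits of SHA-256 change. The helper names and the sample message are invented for the illustration.

```python
import hashlib


def sha256_as_int(data: bytes) -> int:
    """Return the SHA-256 hash of the data as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where the hashes of a and b disagree."""
    # XOR leaves a 1 wherever the two hashes differ; count those 1s.
    return bin(sha256_as_int(a) ^ sha256_as_int(b)).count("1")


original = b"hello world"
flipped = bytearray(original)
flipped[0] ^= 0b00000001  # flip one single bit of the first byte

changed = differing_bits(original, bytes(flipped))
print(f"{changed} of 256 output bits changed")
```

For a good hashing algorithm, the count should land somewhere around half of the output bits, which is what you would expect if the second hash were unrelated to the first.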
Must Withstand Collisions
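Because a hash has a fixed length (64 hexadecimal characters in the case of SHA-256), the number of possible hashes is finite, while the number of possible inputs is not. So two different inputs can end up with the same hash, which is called a collision, and the algorithm needs to be able to withstand that. With a good algorithm, accidental collisions are so rare and unlikely that they are not really a problem in practice.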
The real problem arises if someone is able to create collisions on purpose, because artificial collisions would let them forge documents and do other malicious things. A good hashing algorithm has to make that infeasible.
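To see why the size of the output matters, here is a toy experiment. It builds a deliberately weak hash by keeping only the first 4 hexadecimal characters of SHA-256, then tries inputs until two of them collide. The names tiny_hash and find_collision are made up for this sketch, and the truncation is my own device for the demonstration, not something a real system would use.

```python
import hashlib


def tiny_hash(data: bytes, hex_chars: int = 4) -> str:
    """Deliberately weak hash: keep only the first few hex characters."""
    return hashlib.sha256(data).hexdigest()[:hex_chars]


def find_collision(hex_chars: int = 4) -> tuple:
    """Try the inputs "0", "1", "2", ... until two share a tiny hash."""
    seen = {}  # maps each tiny hash to the first input that produced it
    counter = 0
    while True:
        candidate = str(counter).encode("utf-8")
        digest = tiny_hash(candidate, hex_chars)
        if digest in seen:
            return seen[digest], candidate
        seen[digest] = candidate
        counter += 1


first, second = find_collision()
print(first, second, tiny_hash(first), tiny_hash(second))
```

With only 4 hexadecimal characters there are just 65,536 possible hashes, so a collision turns up quickly. The same brute-force search against a full 64-character hash is far out of reach.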
The SHA-256 Algorithm
SHA-256 is a widely used hashing algorithm that was developed by the NSA and published in 2001. It has seen a lot of use lately, securing and verifying a lot of data in the current digital world. It has not been cryptographically broken yet. A hash takes up only 256 bits no matter how large the data is, which balances space and efficiency.
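If you want to check the 256 bits yourself, this short sketch uses Python's standard hashlib module to hash an empty input and inspect the size of the result. The variable names are just for the example.

```python
import hashlib

raw_digest = hashlib.sha256(b"").digest()     # the hash as raw bytes
hex_digest = hashlib.sha256(b"").hexdigest()  # the same hash as hex text

print(len(raw_digest) * 8)  # 256 bits (32 bytes)
print(len(hex_digest))      # 64 hexadecimal characters, 4 bits each
print(hex_digest)
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

# The number of different values a 256-bit hash can take.
print(2 ** 256)
```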
Hopefully, you now know a lot more about the fundamentals of hashing. If you want to create hashes using Python, you can refer to my old article here.
See you in the next one!