sha256
Running the message `abc` through the SHA-256 hashing algorithm results in the following string:
ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
So what is this giant string? What are the components?
Here’s what we can quickly observe by a visual scan:
- all characters are hex digits: `0-9a-f`
- the total length is 64 characters
- we know that each hexadecimal digit represents four binary digits
- the 256 in the name is the size of the resulting hash in bits: `64 * 4 = 256`
So we can conclude that this string is just a hex representation of a 256-bit number. How can we get a different look at this number?
We can use the Scala `BigInt` class to view the base 10 and also the raw binary (aka base 2) representation of this hashed value:
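Here's a minimal sketch of that idea. The object and value names (`HashAsNumber`, `hex`, `n`, and so on) are just names picked for this illustration; the only library calls are `BigInt(String, Int)` and `toString(radix)`.

```scala
object HashAsNumber {
  // SHA-256 digest of the message "abc", as a 64-character hex string
  val hex: String = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"

  // Parse the hex string as a single base-16 integer
  val n: BigInt = BigInt(hex, 16)

  // The same number rendered in base 10 and in base 2
  val decimal: String = n.toString(10)
  val binary: String = n.toString(2)

  def main(args: Array[String]): Unit = {
    println(decimal)
    println(binary)
    println(binary.length) // count of binary digits printed
  }
}
```

One thing to keep in mind: `toString(2)` drops leading zeros, so a hash whose first hex digit is below `8` would print fewer than 256 binary digits. This particular hash starts with `b` (`1011`), so all 256 digits show up.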
How is this number generated? For that we can use some Java built-in libraries like `MessageDigest`:
This MessageDigest class provides applications the functionality of a message digest algorithm, such as SHA-1 or SHA-256. Message digests are secure one-way hash functions that take arbitrary-sized data and output a fixed-length hash value.
https://docs.oracle.com/javase/7/docs/api/java/security/MessageDigest.html
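A small sketch of calling `MessageDigest` from Scala might look like the following. The object name `DigestAbc` is made up for this example, and encoding the message as UTF-8 is an assumption (for plain ASCII like `abc` most common encodings give the same bytes).

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object DigestAbc {
  // Turn the message into bytes; UTF-8 is an assumption made for this sketch
  val message: Array[Byte] = "abc".getBytes(StandardCharsets.UTF_8)

  // Ask the JDK for a SHA-256 implementation and hash the message bytes
  val hash: Array[Byte] = MessageDigest.getInstance("SHA-256").digest(message)

  def main(args: Array[String]): Unit = {
    println(hash.size)           // number of bytes in the digest
    println(hash.mkString(", ")) // raw signed byte values, some of them negative
  }
}
```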
So what do we have here? The output of the SHA-256 algorithm in this instance is an Array of bytes. `hash.size` is 32, and since each array item is a byte (8 bits), that gives us the expected 32 * 8 = 256 bits.
Mapping to a hex string, `hash.map(_.toHexString).mkString`, leaves us with something that looks sorta like the official hash that I began the post with, but with a bunch of extra `f` characters. As a reminder, `f` in hex equals `1111` in binary and `15` in base 10.
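To see those extra characters for yourself, here's a rough sketch. The object name `NaiveHex` is hypothetical, and `b.toInt.toHexString` is used to spell out the byte-to-int widening that the shorter `_.toHexString` version relies on.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object NaiveHex {
  val hash: Array[Byte] =
    MessageDigest.getInstance("SHA-256").digest("abc".getBytes(StandardCharsets.UTF_8))

  // Each byte is widened to a 32-bit Int before being rendered as hex.
  // A negative byte is sign-extended, so it comes out as 8 hex digits
  // with six leading f characters instead of 2 digits.
  val naive: String = hash.map(b => b.toInt.toHexString).mkString

  def main(args: Array[String]): Unit = {
    println(naive)        // begins ffffffba7816ffffffbf...
    println(naive.length) // well over the expected 64 characters
  }
}
```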
This is because there are some “negative” bytes in the array. Here’s a great answer I found on Stack Overflow to explain what that means:
byte in Java is a number between −128 and 127 (signed, like every integer in Java)… By anding with 0xff you’re forcing it to be a positive int between 0 and 255.
https://stackoverflow.com/questions/9949856/anding-with-0xff-clarification-needed
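As a quick illustration of that answer, here's a sketch using the first byte of our hash, `0xba`. The object and value names (`MaskByte`, `signed`, `masked`) are made up for this example.

```scala
object MaskByte {
  // First byte of the hash: bit pattern 1011 1010
  val b: Byte = 0xba.toByte

  // Widening to Int keeps the sign, so the upper 24 bits are all 1s
  val signed: Int = b.toInt

  // Anding with 0xff clears the upper 24 bits and keeps only the low 8
  val masked: Int = b & 0xff

  def main(args: Array[String]): Unit = {
    println(signed)             // -70
    println(signed.toHexString) // ffffffba
    println(masked)             // 186
    println(masked.toHexString) // ba
  }
}
```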
So let’s map the array again, applying the and transformation (`& 0xff`) and also padding an extra `0` char in the case of single-digit hex values.
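Put together, a sketch of that mapping could look like this. The helper name `sha256Hex` and its enclosing object are hypothetical, and UTF-8 encoding of the message is again an assumption.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object Sha256Hex {
  def sha256Hex(message: String): String = {
    val hash: Array[Byte] =
      MessageDigest.getInstance("SHA-256").digest(message.getBytes(StandardCharsets.UTF_8))

    hash.map { b =>
      // Mask to the low 8 bits so negative bytes are not sign-extended
      val hex = (b & 0xff).toHexString
      // Values below 0x10 render as a single digit, so pad them to two
      if (hex.length == 1) "0" + hex else hex
    }.mkString
  }

  def main(args: Array[String]): Unit = {
    println(sha256Hex("abc"))
  }
}
```

A format string such as `"%02x".format(b & 0xff)` would do the masking and padding in one step; the version above just keeps the two steps visible.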
And there’s our fully hashed value.