Globefin.org

Section 01

The fingerprint idea

Module 01 introduced three jobs cryptography does — confidentiality, integrity, authenticity — and named the tool behind integrity: the hash. We start the primitives here because the hash is the simplest of them, requires no notion of keys or secrets, and reappears at the heart of almost everything later in the track, including the blockchain.

The intuition is right there in the name we will use throughout: a hash is a digital fingerprint. A human fingerprint is a small, fixed-size thing that uniquely identifies a whole person: it is far smaller than the person, you cannot reconstruct the person from the fingerprint, and no two people share one. A hash does exactly this for data. Feed in any data — a single word, a legal contract, a film, an entire bank's transaction history — and the hash function produces a short string of fixed length that serves as that data's unique fingerprint. The data can be gigabytes; the fingerprint is always the same modest size.

Here is the whole idea in one picture: data of any size goes in, a fixed-size digest comes out, and the door only swings one way.

A hash function, then, is just a process that takes an input of any size and deterministically produces this fixed-size fingerprint, often called a digest. It is not encryption — nothing is being hidden and nothing will be unscrambled later — and that distinction matters, because beginners constantly confuse the two. Encryption is a locked box you intend to open again; hashing is a fingerprint you never intend to reverse.

What a hash is

A hash function takes any data and produces a short, fixed-size fingerprint (a digest) of it. Like a human fingerprint, it is far smaller than the original, you cannot reconstruct the original from it, and no two inputs realistically share one. It is not encryption — nothing is hidden to be revealed later; it is a compact, reliable stand-in for data.

Section 02

A tiny hash you can do by hand

Before the real thing, let us demystify what a hash function actually does with a toy version simple enough to compute on paper. Real hash functions are far more elaborate, but the toy shares their essential shape, and doing one by hand removes the mystery.

Here is our toy hash. Take a word, convert each letter to its position in the alphabet (a=1, b=2, …, z=26), add those numbers up, and take the remainder when divided by 97. That remainder — a number from 0 to 96 — is our fingerprint. Watch it work on the word "cat":

letter	c	a	t	sum	÷ 97 remainder
value	3	1	20	24	24

So our toy hash of "cat" is 24. Notice it already shows the core properties. It is deterministic: "cat" will always give 24. It is fixed-size: the answer is always a number from 0 to 96, whether you hash "cat" or the entire dictionary. And it is somewhat one-way: if I tell you the fingerprint is 24, you cannot recover the input — "cat" gives 24, but so do countless other inputs (any letters summing to 24, or 121, or 218…). That last point also reveals why the toy is weak: those easy-to-find matches are collisions, and a real hash must make them practically impossible to find. Try it yourself:

Live demo · toy hash

Compute the toy hash (sum of letters, mod 97)

Type letters only. Watch the running sum and the fingerprint. Try "cat", then "act" and "tac" — same letters, same fingerprint. That is a collision.

letter values: 3 · 1 · 20 | sum: 24 | mod 97: 24

toy fingerprint → 24

This toy is a real (bad) hash function, and feeling it work is the point: a hash is just a procedure that crushes any input down to a fixed-size value in a way you cannot run backward. The genius of a cryptographic hash is doing this so thoroughly that the six properties named in Section 04 all hold at once. Our toy badly fails that test: its collisions are trivial to find, and a one-letter change barely moves the result.
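The toy hash is short enough to write down as code. The sketch below follows the rule given above (letter positions, summed, mod 97); the function name toy_hash and the choice to reject non-letters are my own assumptions, not part of the lesson's demo.

```python
def toy_hash(word: str) -> int:
    """Toy hash from the text: sum of letter positions (a=1 ... z=26), mod 97."""
    total = 0
    for ch in word.lower():
        if not "a" <= ch <= "z":
            raise ValueError("letters only")
        total += ord(ch) - ord("a") + 1  # a -> 1, b -> 2, ..., z -> 26
    return total % 97


print(toy_hash("cat"))                   # 24, as in the table
print(toy_hash("act"), toy_hash("tac"))  # 24 24: same letters, same sum, a collision
print(toy_hash("cat"), toy_hash("cbt"))  # 24 25: a one-letter change barely moves it
```

The last line shows the weakness the text points out: changing "a" to "b" shifts the fingerprint by exactly one, so the output tells you a great deal about the input.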

Section 03

The real thing — and the avalanche

Now meet a real cryptographic hash. The workhorse of modern finance is SHA-256 — the function that secures Bitcoin. It takes any input and produces a 256-bit digest, which we write as 64 hexadecimal characters (each hex character stands for 4 bits). Here is the genuine SHA-256 of the word finance:

Real SHA-256

"finance" → (digest computed live in the browser)

"Finance" (one capital F) → (digest computed live in the browser)

Changing one letter from lowercase to capital produces a completely different fingerprint — not a slightly different one. There is no resemblance between the two. That is the avalanche effect.

This is the property that makes a hash useful for integrity, and it deserves its name: the avalanche effect. Change the input by the tiniest amount — one character, one comma, a single capital letter — and the fingerprint changes completely and unpredictably. A tiny ripple in the input causes an avalanche in the output. It means there is no such thing as a "small" change that produces a "small" change to the fingerprint: any alteration whatsoever, however trivial, yields a totally different digest, so there is nowhere for a change to hide.
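The "finance" versus "Finance" comparison is easy to reproduce away from the page. This is a minimal sketch assuming Python's standard hashlib module and UTF-8 encoding of the text (the page's own demo may encode its input differently); the helper names are mine.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 digest of the UTF-8 bytes of text, as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


lower = sha256_hex("finance")
upper = sha256_hex("Finance")
print(lower)
print(upper)
print(differing_bits(lower, upper), "of 256 bits differ")
```

For a well-behaved hash you should expect roughly half of the 256 bits to flip, which is what "no resemblance" means in measurable terms.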

Do not take it on faith — make it happen yourself. Type into the box below and watch the real SHA-256 digest recompute live. Change a single character and watch the entire fingerprint transform. The characters that differ from the previous fingerprint are highlighted, so you can see how thoroughly one keystroke scrambles the whole output:

Live demo · real SHA-256 playground

Type anything — watch the real fingerprint

This computes genuine SHA-256 in your browser. Try a sentence, then change one letter. Notice the length never changes, no matter how much you type.

SHA-256 → (digest computed live in the browser)

input length: 21 chars · digest length: always 64 hex chars

Two things should jump out as you type. First, the digest length never changes — one character or a whole paragraph, always 64 hex characters. That is the fixed-size property, seen live. Second, the smallest edit detonates the whole fingerprint. Hold onto that feeling: it is the entire reason a hash can guard integrity, store passwords safely, and chain a blockchain together — all of which we now build on.
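The fixed-size property can also be checked directly. The sketch below, again assuming Python's hashlib, hashes three inputs of very different sizes; the sample inputs are arbitrary choices of mine.

```python
import hashlib

inputs = {
    "one character": b"a",
    "a sentence": b"The quick brown fox jumps over the lazy dog",
    "one megabyte of zeros": bytes(1_000_000),
}

for label, data in inputs.items():
    digest = hashlib.sha256(data).hexdigest()
    print(f"{label:>22}: {len(data):>9} bytes in, {len(digest)} hex chars out")

# Collect the distinct digest lengths seen; there should be exactly one.
digest_lengths = {len(hashlib.sha256(d).hexdigest()) for d in inputs.values()}
print(digest_lengths)  # {64}
```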

Section 04

The properties, now that you've seen them

Having felt the hash work, we can name its properties precisely — each one now something you have observed rather than been told. A cryptographic hash has four "can-do" properties and two crucial "cannot-do" properties.

The four things it does: it is deterministic (same input, same digest, every time, anywhere — you saw "cat" always give 24); it is fixed-size (any input, same-length output — you saw SHA-256 stay 64 characters no matter what you typed); it is fast (the playground recomputed instantly on every keystroke); and it shows the avalanche effect (one character changed the whole digest). These are what make it a convenient, reliable fingerprint.

The two things it cannot let you do — and these are what separate a cryptographic hash from our weak toy:

One-way (irreversible). Given data, computing its digest is instant; given only a digest, there is no practical way to work backward to the data. The fingerprint reveals essentially nothing about its input. Like a fingerprint at a crime scene: it confirms a match if you already have the suspect, but you cannot grow the person back from the print.
Collision-resistant. A "collision" is two different inputs with the same digest. Because digests are fixed-size and inputs unlimited, collisions must exist in theory — our toy had them everywhere — but a strong hash makes them so astronomically hard to find that, in practice, every input has its own unique fingerprint. No one can feasibly discover two different documents with the same SHA-256 digest.

Put the two "cannot-do" properties together and you have the foundation of every security use that follows. Irreversibility means a digest can be stored or published without revealing the data behind it — the key to safe password storage. Collision resistance means a digest is a trustworthy unique identifier for its data — the key to integrity, signatures, and blockchains. The next sections put each to work.
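To get a feel for why digest size matters to the two "cannot-do" properties, you can weaken SHA-256 on purpose. The sketch below is my own illustration, not something from the lesson's demos: it keeps only the first 4 hex characters (16 bits) of each digest and searches by brute force for a second input that matches a given one. The names truncated_sha256 and find_second_input are hypothetical helpers.

```python
import hashlib
from itertools import count


def truncated_sha256(data: bytes, hex_chars: int) -> str:
    """First hex_chars characters of the SHA-256 digest: a deliberately weakened hash."""
    return hashlib.sha256(data).hexdigest()[:hex_chars]


def find_second_input(target: bytes, hex_chars: int):
    """Try guess-1, guess-2, ... until one matches target's truncated digest."""
    wanted = truncated_sha256(target, hex_chars)
    for tries in count(1):
        candidate = f"guess-{tries}".encode()
        if candidate != target and truncated_sha256(candidate, hex_chars) == wanted:
            return candidate, tries


candidate, tries = find_second_input(b"finance", hex_chars=4)
print(f"{candidate!r} matches the first 4 hex chars after {tries} tries")
```

With 16 bits the search needs about 65,536 guesses on average and finishes almost at once. Every extra hex character multiplies that work by 16. At the full 64 characters, matching a given digest this way takes on the order of 2^256 guesses, and even the easier task of finding any two colliding inputs takes on the order of 2^128, which is what "astronomically hard" means here.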

Six properties, two that matter most

Can-do: deterministic, fixed-size, fast, avalanche-sensitive. Cannot-do: one-way (no recovering input from digest) and collision-resistant (no finding two inputs with the same digest). The two "cannot-do" properties are what make a cryptographic hash more than a checksum — and what power password storage, integrity, and the blockchain.

Section 05

Use one: storing passwords safely

The one-way property solves a real problem: how should a bank store your password so that it can check it at login, without keeping a copy that could be stolen?

Storing passwords directly is a catastrophe waiting to happen: if the database is breached (and databases are breached constantly), every password is exposed at once. The hash provides the escape. Instead of your password, the service stores only the digest of your password. At login it hashes what you type and compares digests. A match means the password was correct, yet the service never stored, and need not know, the password itself. Because hashing is one-way, a stolen database of digests cannot simply be run backward into passwords; an attacker is left guessing candidate passwords and hashing each one.

There is a wrinkle, and you can see it live. If two users pick the same password, naive hashing gives them the same digest, and attackers can precompute digests of common passwords and look them up. The defense is salting — mixing a unique random value into each password before hashing, so identical passwords get different digests. Type a password below, then add a salt, and watch identical passwords diverge:

Live demo · salting

Why identical passwords need different fingerprints

Both users typed the same password. Without salt, identical digests (an attacker's dream). Add a unique salt to each and the digests diverge completely.

Verify without storing

Services store the fingerprint of your password, never the password itself. At login they hash what you type and compare fingerprints — so a breached database of digests cannot be reversed into passwords. A unique random "salt" per password makes identical passwords produce different digests and defeats precomputed lookup attacks.
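The store-and-compare flow with a per-user salt can be sketched as follows. One substitution to flag loudly: the text describes plain salted hashing, while this sketch uses PBKDF2-HMAC-SHA256 from Python's hashlib, a deliberately slow salted construction, because a fast hash lets an attacker test guesses quickly. The iteration count, salt length, sample password, and function names are illustrative assumptions.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

ITERATIONS = 100_000  # illustrative work factor (an assumption, not a recommendation)


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is drawn when none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


# Without salt: two users with the same password get the same digest.
unsalted_a = hashlib.sha256(b"hunter2").hexdigest()
unsalted_b = hashlib.sha256(b"hunter2").hexdigest()
print(unsalted_a == unsalted_b)  # True

# With a unique salt each, the stored digests diverge.
salt_a, digest_a = hash_password("hunter2")
salt_b, digest_b = hash_password("hunter2")
print(digest_a == digest_b)      # False

print(verify_password("hunter2", salt_a, digest_a))  # True
print(verify_password("hunter3", salt_a, digest_a))  # False
```

The service stores only the salt and the digest for each user. The salt is not secret; its job is to make every stored digest unique so that one precomputed table cannot be reused across accounts.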

Section 06

Use two: making signatures practical

Here is a use that connects the hash to the rest of the track. Module 01 introduced the digital signature, which Module 05 covers fully. The hash is what makes signatures practical.

Signing data cryptographically is a relatively slow, heavy operation, and doing it directly on a large document would be cumbersome. The elegant solution used everywhere: instead of signing the whole document, you sign its fingerprint. You hash the document down to its short digest and apply the signature to that small, fixed-size value. Because the fingerprint uniquely represents the document (collision resistance) and changes completely if the document changes at all (the avalanche effect), signing the fingerprint is as good as signing the entire document — any tampering would change the fingerprint and instantly invalidate the signature.

This is the recurring pattern you will see throughout the track: the hash acts as a compact, faithful proxy for data, so operations that are expensive or awkward on large data can be done cheaply on the small fingerprint instead, with no loss of security. Sign the fingerprint, not the file. Compare the fingerprint, not the file. Link the fingerprint, not the file. That last one — link the fingerprint — is exactly how a blockchain is built, which is where we go next, with a working one you can break.
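"Sign the fingerprint, not the file" can be made concrete with a toy. The sketch below uses unpadded textbook RSA with a tiny key built from two known Mersenne primes, purely so it runs with the standard library alone. It is an assumption-laden illustration of the hash-then-sign pattern, not how Module 05 or any real system implements signatures; real signatures use padded schemes and much larger keys from a vetted library. All names are hypothetical.

```python
import hashlib

# Toy key material: far too small for real use, chosen only to keep the sketch short.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (modular inverse, Python 3.8+)


def fingerprint(document: bytes) -> int:
    """SHA-256 digest as an integer, reduced mod N to fit the toy key."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Apply the private key to the fingerprint, never to the document itself."""
    return pow(fingerprint(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Undo the signature with the public key and compare with a fresh fingerprint."""
    return pow(signature, E, N) == fingerprint(document)


contract = b"Pay Alice 100 euros. " * 10_000   # a large document
signature = sign(contract)
tampered = contract.replace(b"100", b"900", 1)  # change a single digit

print(verify(contract, signature))  # True
print(verify(tampered, signature))  # False
```

However large the document grows, the expensive signing step only ever touches one small number, and the single changed digit is enough to make verification fail.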

Section 07

Use three: the blockchain — break it yourself

The most famous use of hashing is the building block of a blockchain — and you can now understand it completely, because it uses only the hash properties you have already seen. We will devote a full module to blockchains later; here is the core, made tangible.

The goal: keep a ledger of transactions so that no one can secretly alter the past. The trick: group transactions into blocks, and include in each new block the fingerprint of the previous block. Because that previous fingerprint depends on the previous block's contents, which include the fingerprint of the block before it, each block's fingerprint effectively depends on the entire history before it. The blocks are chained by fingerprints. Hence "blockchain."

Now see why this makes the past tamper-evident. Below is a small, working blockchain: each block shows its data, the previous block's fingerprint it carries, and its own computed fingerprint (all real SHA-256, computed live). Every block is green and valid. Now edit the data in any block — change an amount, say — and watch what happens: that block's fingerprint changes (avalanche effect), so it no longer matches the "previous fingerprint" stored in the next block, and every block from there to the end turns red. To hide the change, an attacker would have to recompute every single block afterward. That is tamper-evidence, and it is built entirely from hashing.

Live demo · tamper the blockchain

Edit any block's data and watch the chain break

Each block links to the one before it by storing its fingerprint. Change any block's data — the break cascades to every block after it. Real SHA-256, recomputed live.

Everything you learned is doing work in that one mechanism: deterministic (everyone computes the same fingerprints), avalanche (any edit transforms a block's fingerprint), and collision-resistant (an attacker cannot craft different data with a matching fingerprint). Hashing alone does not make tampering impossible — a determined attacker could recompute the whole chain — but it makes any change instantly detectable and enormously expensive to cover up. The later module on consensus shows how a network makes that recomputation practically hopeless. For now, notice: a blockchain is not something that merely uses hashing — the chain literally is a structure of linked fingerprints.
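The chain of fingerprints fits in a few lines of code. This is a minimal sketch under my own assumptions, not the code behind the live demo: each block stores its data, the previous block's fingerprint, and its own fingerprint, computed as SHA-256 over the previous fingerprint and the data joined by a "|" separator. The all-zero starting value and every name below are hypothetical.

```python
import hashlib
from dataclasses import dataclass

GENESIS_PREV = "0" * 64  # placeholder "previous fingerprint" for the first block


def block_fingerprint(prev_fingerprint: str, data: str) -> str:
    return hashlib.sha256(f"{prev_fingerprint}|{data}".encode("utf-8")).hexdigest()


@dataclass
class Block:
    data: str
    prev_fingerprint: str
    fingerprint: str


def build_chain(entries):
    chain, prev = [], GENESIS_PREV
    for data in entries:
        fp = block_fingerprint(prev, data)
        chain.append(Block(data, prev, fp))
        prev = fp
    return chain


def validity(chain):
    """One flag per block: True only if it and every block before it checks out."""
    flags, prev, ok = [], GENESIS_PREV, True
    for block in chain:
        ok = (ok and block.prev_fingerprint == prev
              and block.fingerprint == block_fingerprint(block.prev_fingerprint, block.data))
        flags.append(ok)
        prev = block.fingerprint
    return flags


def reseal(chain, start):
    """What a forger must do: recompute every fingerprint from the edit onward."""
    prev = chain[start].prev_fingerprint
    for block in chain[start:]:
        block.prev_fingerprint = prev
        block.fingerprint = block_fingerprint(prev, block.data)
        prev = block.fingerprint


chain = build_chain(["Alice pays Bob 10", "Bob pays Carol 4",
                     "Carol pays Dan 1", "Dan pays Eve 1"])
original_tip = chain[-1].fingerprint
print(validity(chain))            # [True, True, True, True]

chain[1].data = "Bob pays Carol 400"  # tamper with the second block
print(validity(chain))            # [True, False, False, False]

reseal(chain, 1)                  # cover-up: redo every later block
print(validity(chain))            # [True, True, True, True]
print(chain[-1].fingerprint == original_tip)  # False: the chain's tip has changed
```

The last two lines show the caveat from the text: a forger who recomputes everything gets an internally consistent chain, but its final fingerprint no longer matches the one everyone else holds.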

⛓️ What you just did

By editing one block and watching the chain cascade to red, you saw why a blockchain's history is hard to forge — the same reason it underpins Bitcoin. Each block carries the previous block's fingerprint, so altering any past data breaks every block after it, and only the avalanche effect and collision resistance of the humble hash make that possible. You understood the structural heart of a blockchain before we ever opened the dedicated module — purely from the fingerprint.

Section 08

When hashes break — and what's next

A brief, honest note on limits. A hash's security rests on those "cannot-do" properties holding — above all, that no one can find collisions. But hash functions are human-designed, and over time clever attacks or sheer growth in computing power can weaken a given function until collisions become findable. When that happens, the function must be retired.

This has happened. Older functions once trusted across finance and the web — known as MD5 and SHA-1 — were gradually broken as researchers learned to produce collisions, and they have been deprecated and replaced. The lesson is not that hashing is unreliable but that specific functions have lifespans, and security means migrating to stronger ones as older ones age. Today's workhorse is the SHA-2 family, and in particular SHA-256 — the one you have been playing with, and the one that secures Bitcoin, fingerprinting its blocks and underpinning its mining.

With the hash in hand, the track moves forward. The hash gave us integrity, but it does nothing for confidentiality: it is a fingerprint, not a lockbox, and nothing put through it can be taken back out. To keep data secret rather than merely tamper-evident, we need a different tool: encryption. The next module takes up the oldest and most intuitive form, symmetric encryption (scrambling data with a shared secret key so only someone with the same key can unscramble it), and it will run straight into a problem the hash never faced: how do two strangers agree on a shared secret in the first place? That problem is the doorway to the most important idea in the track.

Next module

Module 03 · Symmetric Encryption — The Shared Secret

From fingerprint to lockbox. The oldest, most intuitive encryption: scrambling data with a secret key so only someone holding the same key can unscramble it — with live demos, as always. How it delivers the confidentiality the hash cannot, why it is the workhorse securing data at rest and in transit, and the awkward problem it creates — sharing the secret key with someone you have never met — that motivates the breakthrough to come.

Self-examination

Test your understanding

Six questions on hashing — what a hash is, its properties, and its uses in integrity, passwords, signatures, and blockchains. The questions test the concepts you just saw in action.

Hashing — the digital fingerprint