Imagine you could take any piece of data (a single word, a novel, a 4 GB video) and reduce it to a short, fixed-length string of characters that, for all practical purposes, uniquely identifies it. Change a single comma in the original and the string becomes completely different. And there is no practical way to reverse the process to recover the original data.
That is exactly what a cryptographic hash function does. It is one of the most fundamental building blocks of modern security, and it quietly protects nearly everything you do online.
The fingerprint metaphor
A hash works like a digital fingerprint. Your fingerprint uniquely identifies you, but no one can reconstruct your face from it. Similarly, a cryptographic hash uniquely identifies a piece of data without revealing what that data is.
Feed any input into a hash function and you get a digest (also called hash value or checksum) — a fixed-length string of hexadecimal characters. SHA-256, one of the most common algorithms, always produces a 64-character output regardless of input size.
| Input | SHA-256 digest (first 16 chars) |
|---|---|
| Hello | 185f8db32271fe25... |
| hello | 2cf24dba5fb0a30e... |
| Entire Wikipedia dump | (still 64 characters) |
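The table above can be reproduced with Python's standard `hashlib` module. This is a minimal sketch: the helper name `sha256_hex` and the one-million-byte stand-in for a large input are illustrative choices, not part of the original article.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hexadecimal."""
    return hashlib.sha256(data).hexdigest()


# Two short inputs that differ only in case, plus a much larger input.
samples = {
    "Hello": b"Hello",
    "hello": b"hello",
    "1,000,000 bytes": b"x" * 1_000_000,
}

for label, data in samples.items():
    digest = sha256_hex(data)
    # The digest is always 64 hex characters, whatever the input size.
    print(f"{label:<16} {digest[:16]}... ({len(digest)} hex chars)")
```

Note that the function hashes bytes, not text, so a string has to be encoded first (UTF-8 is assumed here); a different encoding of the same text would give a different digest.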
The four essential properties
A function qualifies as a cryptographic hash only if it satisfies these properties:
- Deterministic. The same input always produces the same output, on any machine, at any time.
- One-way (pre-image resistance). Given a hash, it is computationally infeasible to find the original input. You cannot "unhash" data.
- Collision-resistant. It is practically impossible to find two different inputs that produce the same digest. For SHA-256, even the best generic (birthday) attack needs on the order of 2^128 hash computations, about 3.4 × 10^38, which is far beyond any foreseeable computing capacity.
- Avalanche effect. Changing a single bit in the input flips roughly half the bits in the output. There is no detectable pattern or relationship between similar inputs.
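The determinism and avalanche properties are easy to observe directly. The sketch below flips one input bit and counts how many of the 256 output bits change; the sample message and the helper `bit_difference` are assumptions made for illustration.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = b"The quick brown fox"
# Flip only the lowest bit of the first byte; everything else is unchanged.
modified = bytes([original[0] ^ 0x01]) + original[1:]

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(modified).digest()

# Deterministic: hashing the same input again gives the same digest.
assert hashlib.sha256(original).digest() == d1

changed = bit_difference(d1, d2)
print(f"{changed} of {len(d1) * 8} output bits changed")
```

The exact count varies with the input, but it should land close to 128, which is half of the output bits.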
Key distinction: Hashing is not encryption. Encryption is reversible with a key; hashing is deliberately irreversible. You encrypt data to keep it confidential in transit or at rest. You hash data to verify its integrity or store a proof without keeping the original.
Common hash algorithms
MD5 (1991)
- Output: 128 bits (32 hex characters)
- Status: Cryptographically broken. Researchers can generate collisions in seconds on a laptop.
- Still seen in: Non-security checksums for file downloads and legacy systems.
SHA-1 (1995)
- Output: 160 bits (40 hex characters)
- Status: Deprecated for security. Researchers at Google and CWI Amsterdam demonstrated a practical collision in 2017 (the "SHAttered" attack).
- Still seen in: Git (where SHA-1 remains the default object hash), some legacy certificates.
SHA-256 (2001)
- Output: 256 bits (64 hex characters)
- Status: Current standard. No known practical attacks.
- Used in: TLS certificates, Bitcoin, digital signatures, file integrity verification. Not suitable on its own for password storage, because it is designed to be fast.
SHA-3 (2015)
- Output: 224, 256, 384, or 512 bits (256 is common); the related SHAKE functions allow variable-length output
- Status: Latest standard, based on an entirely different internal design (Keccak sponge construction) from the SHA-2 family.
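- Used in: Forward-looking systems and situations requiring algorithm diversity. Ethereum uses Keccak-256, the pre-standardization variant, which produces different digests from the final SHA3-256.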
| Algorithm | Output size | Secure? | Speed |
|---|---|---|---|
| MD5 | 128 bits | No | Very fast |
| SHA-1 | 160 bits | No | Fast |
| SHA-256 | 256 bits | Yes | Moderate |
| SHA-3-256 | 256 bits | Yes | Moderate |
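The output sizes in the table can be checked with `hashlib`, which exposes all four algorithms under the names used below. This is a small sketch; the helper `digest_bits` is a hypothetical name, and on systems running in a restricted (for example FIPS) mode the MD5 and SHA-1 calls may be refused.

```python
import hashlib


def digest_bits(algorithm: str, message: bytes = b"hello") -> int:
    """Return the digest length in bits for the named hashlib algorithm."""
    return hashlib.new(algorithm, message).digest_size * 8


for name in ("md5", "sha1", "sha256", "sha3_256"):
    digest = hashlib.new(name, b"hello").hexdigest()
    print(f"{name:<9} {digest_bits(name):>3} bits  {digest}")
```

SHA-256 and SHA3-256 produce digests of the same length but with completely different values for the same input, since they are unrelated designs.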
Where cryptographic hashing is used
Password storage
When you create an account, a well-designed system never stores your password in plain text. It stores the hash. When you log in, the system hashes what you type and compares it to the stored value. Even if the database is breached, attackers get hashes — not passwords.
Modern systems go further by adding a random salt (a unique value per password, combined with it before hashing, so identical passwords produce different hashes) and using intentionally slow algorithms like bcrypt, scrypt, or Argon2 to make brute-force guessing impractical.
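As a rough sketch of salted, slow password hashing, the example below uses `hashlib.scrypt` from the Python standard library (available when Python is built against a recent OpenSSL). The function names and the cost parameters are assumptions chosen for illustration, not tuned recommendations.

```python
import hashlib
import hmac
import os

# Illustrative cost settings (assumption); real deployments should tune these.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_hash) for a newly chosen password."""
    salt = os.urandom(16)  # fresh random salt for every password
    derived = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the hash from the login attempt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The system stores the salt next to the hash; the salt is not secret. Its job is to make each stored hash unique so that one precomputed table cannot be reused across accounts.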
File integrity verification
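Software distributors publish SHA-256 checksums alongside their downloads. After downloading, you compute the hash of the file on your machine and compare. A match shows the file was not corrupted in transit, and it rules out tampering as long as the published checksum itself came from a trusted source (for example, an HTTPS page or a signed release note).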
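The download check can be sketched in a few lines. The file is read in chunks so that large downloads do not need to fit in memory; the function names and the chunk size are illustrative assumptions.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file incrementally and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare the local digest with the checksum copied from the publisher."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

A call such as `matches_published("installer.bin", "2cf24d...")` (both arguments hypothetical) returns `True` only when every byte of the file matches what the publisher hashed.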
Digital signatures
When you digitally sign a document, the system hashes the document first, then signs the hash with your private key (with RSA, this step is often described as encrypting the hash). The recipient uses your public key to check the signature against their own hash of the document. This is far more efficient than signing the entire document directly, and it proves both authorship and integrity.
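To make the hash-then-sign flow concrete, here is a deliberately toy sketch using textbook RSA with Python's built-in `pow`. Everything about the key is an assumption for illustration: it is built from two small known Mersenne primes, uses no padding, and reduces the digest to fit the tiny modulus. Real signatures use a vetted library, much larger keys, and a padding scheme.

```python
import hashlib

# Toy key material (assumption): far too small and unpadded for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Hash the document and reduce the digest into the toy key's range."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Signer: apply the private exponent to the document's hash."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Recipient: undo the signature with the public key, compare to own hash."""
    return pow(signature, E, N) == digest_as_int(document)


signature = sign(b"Pay Alice 10 coins")
print(verify(b"Pay Alice 10 coins", signature))
print(verify(b"Pay Alice 1000 coins", signature))
```

Only the fixed-size hash goes through the expensive public-key operation, which is why the document itself can be of any length.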
Blockchain
Each block in a blockchain contains the hash of the previous block, creating a tamper-evident chain. Altering any past transaction changes its block's hash, which breaks the chain from that point forward, making tampering immediately visible.
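The chaining idea can be sketched without any real blockchain machinery. The block layout below (a dictionary with `index`, `data`, and `prev_hash`) and the function names are hypothetical; actual blockchains add consensus rules, timestamps, and more.

```python
import hashlib
import json
from typing import Optional


def block_hash(block: dict) -> str:
    """Hash a block's contents using a stable JSON serialization."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(transactions: list) -> list:
    """Link each block to its predecessor by storing the predecessor's hash."""
    chain = []
    prev = "0" * 64  # placeholder value for the first block
    for i, tx in enumerate(transactions):
        block = {"index": i, "data": tx, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_broken_link(chain: list) -> Optional[int]:
    """Return the index of the first block whose prev_hash no longer matches."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
print(first_broken_link(chain))   # intact chain
chain[0]["data"] = "A pays B 500"  # tamper with an old transaction
print(first_broken_link(chain))   # the link after the altered block breaks
```

In this simplified check the final block has no successor that covers it, so an edit there would go unnoticed; real systems protect the tip of the chain by other means.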
Why MD5 and SHA-1 are considered broken
A hash algorithm is "broken" when someone can deliberately create two different inputs that produce the same hash (a collision). This undermines every use case that relies on uniqueness.
- MD5: Collisions can be generated in seconds. Researchers have created two different PDF files with identical MD5 hashes.
- SHA-1: The SHAttered attack in 2017 produced two different PDFs with the same SHA-1 hash. It required roughly 6,500 CPU-years plus 110 GPU-years of computation, which is expensive but feasible with cloud resources.
Practical rule: Never use MD5 or SHA-1 for anything security-related: certificates, digital signatures, or integrity checks where an adversary might be involved. Use SHA-256 or SHA-3 instead, and for passwords use a dedicated slow algorithm such as bcrypt, scrypt, or Argon2.
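To see what a collision looks like and why digest length matters, the sketch below runs a simple birthday search against SHA-256 truncated to a few bytes. This is not how MD5 or SHA-1 were broken (those attacks exploit structural weaknesses); it only illustrates that a short digest collides quickly, while the full 256 bits would need on the order of 2^128 attempts. The function names and the counter-based messages are assumptions.

```python
import hashlib


def truncated_hash(data: bytes, n_bytes: int) -> bytes:
    """Keep only the first n_bytes of the SHA-256 digest (a weakened hash)."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 3) -> tuple:
    """Hash successive counters until two share a truncated digest.

    Returns (first_message, second_message, number_of_hashes_computed).
    """
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode("ascii")
        tag = truncated_hash(msg, n_bytes)
        if tag in seen:
            return seen[tag], msg, counter + 1
        seen[tag] = msg
        counter += 1


a, b, tries = find_collision(3)
print(f"{a!r} and {b!r} collide on 24 bits after {tries} hashes")
```

With a 24-bit digest there are about 16.7 million possible values, yet a collision typically appears after only a few thousand hashes, which is the birthday effect behind the 2^128 figure for SHA-256.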
Going further
Hashing is one of those concepts that becomes intuitive once you experiment with it. Try hashing a sentence, then change one character and observe the avalanche effect firsthand.
- How to Generate and Verify Hashes — step-by-step tutorial
- Hash Generator — compute SHA-256, MD5, SHA-512, and more instantly in your browser
- Hash Identifier — paste an unknown hash and identify its algorithm