md5sum and sha256sum: Verifying File Integrity

What You'll Learn

  • Why checksums (hashes) exist and what problem they solve
  • How to compute a file's fingerprint with sha256sum / md5sum
  • How to use -c to verify whether a file is corrupted or swapped out
  • The difference between corruption detection and tamper detection, and why MD5 is discouraged today

Quick Summary

  • Just want a fingerprint → sha256sum file
  • Check it matches the official value → sha256sum -c SHA256SUMS
  • For tamper protection, use SHA-256 or stronger. MD5 / SHA-1 are suitable only for catching accidental corruption

Environment

  • OS: Ubuntu / typical Linux
  • md5sum / sha256sum ship with GNU coreutils and are preinstalled (no install needed)
  • Relatives include sha1sum / sha512sum / b2sum

1. What Is a Checksum?

Conclusion: A checksum is a "fingerprint" computed from a file's contents; changing even one bit changes it drastically.

Lina: Senpai, download pages sometimes show a long string like "SHA256: a1b2c3...". What is that?
Linny-senpai: That's a checksum, also called a hash value. It's like a "fingerprint" computed by reading the entire contents of a file. The same contents always produce the same value.
Lina: A fingerprint... so it's different for every file?
Linny-senpai: Exactly. And if even a single bit changes, the fingerprint becomes a completely different value. That lets you instantly check whether a downloaded file is identical to the original.

What checksums let you do

  • Detect whether a downloaded file got corrupted in transit (corruption detection)
  • Confirm the contents match the source (detecting swaps / tampering)
  • Compare whether two files are byte-for-byte identical
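The "one bit changes everything" property is easy to see for yourself. The following is a minimal sketch, assuming GNU coreutils is installed; the input strings are arbitrary examples chosen so that the last byte differs by a single bit.

```shell
#!/bin/sh
# Sketch: flip a single bit of the input and compare the SHA-256 values.
# The strings are arbitrary: 'o' (0x6f) and 'n' (0x6e) differ in one bit.
h1=$(printf 'hello' | sha256sum | cut -d ' ' -f 1)
h2=$(printf 'helln' | sha256sum | cut -d ' ' -f 1)
h3=$(printf 'hello' | sha256sum | cut -d ' ' -f 1)

echo "hello : $h1"
echo "helln : $h2"

# Same contents always give the same value; changed contents do not.
if [ "$h1" = "$h3" ]; then echo "same input  -> same hash"; fi
if [ "$h1" != "$h2" ]; then echo "one bit off -> different hash"; fi
```

The two printed values share no obvious resemblance even though the inputs are almost identical, which is what makes a hash useful as a fingerprint.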

2. Computing a Fingerprint with sha256sum

Conclusion: sha256sum file computes the hash; output is "hash + two spaces + filename".

2-1. Basics: Fingerprint a Single File

$ sha256sum ubuntu.iso
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  ubuntu.iso
Lina: Whoa, a really long string came out, with the filename after it.
Linny-senpai: That's the SHA-256 fingerprint — 64 hexadecimal digits. If it matches, the contents are considered identical. Note there are two spaces in between; that matters later when verifying.

How to read the output

e3b0c4...b855  ubuntu.iso
└── hash ────┘└┘└ filename
              two spaces
  • The first space is a separator
  • The second character marks the input mode (a space = text / * = binary)
  • On GNU systems both modes are identical (kept for historical compatibility)
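To make the layout concrete, here is a small sketch that splits one output line into its three parts using plain shell parameter expansion. The file name sample.txt is a throwaway created only for the demo, and the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: split one line of sha256sum output into hash, mode, and filename.
# "sample.txt" is a throwaway file created just for this demo.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'hello\n' > sample.txt

line=$(sha256sum sample.txt)

hash=${line%% *}                          # everything before the first space
rest=${line#* }                           # everything after the first space
mode=$(printf '%s' "$rest" | cut -c 1)    # ' ' (text) or '*' (binary)
name=${rest#?}                            # drop the mode character

echo "hash length : ${#hash}"
echo "mode char   : [$mode]"
echo "filename    : $name"
```

This simple split assumes an ordinary file name; names containing newlines or backslashes are escaped by sha256sum and would need extra handling.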

2-2. md5sum Works the Same Way

$ md5sum ubuntu.iso
d41d8cd98f00b204e9800998ecf8427e  ubuntu.iso

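md5sum computes MD5 and sha256sum computes SHA-256, but the usage is identical. The same goes for sha1sum / sha512sum. The hash values printed in these samples are illustrative; a real ISO image will produce different values.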
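Because the tools share one interface, you can loop over them. The sketch below, assuming all four commands are available from GNU coreutils, hashes the same arbitrary string with each tool and prints the digest length.

```shell
#!/bin/sh
# Sketch: run the sibling tools on the same input and compare digest lengths.
# The input string "hello" is an arbitrary example.
for tool in md5sum sha1sum sha256sum sha512sum; do
    digest=$(printf 'hello' | "$tool" | cut -d ' ' -f 1)
    # Each hex digit encodes 4 bits, so bits = hex digits * 4.
    echo "$tool: ${#digest} hex digits ($(( ${#digest} * 4 )) bits)"
done
```

The only visible difference between the tools is the length of the fingerprint: 32 hex digits for MD5, 40 for SHA-1, 64 for SHA-256, and 128 for SHA-512.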

3. Multiple Files and Saving a List

Conclusion: You can hash many files at once and save the list with > to reuse it later with -c.

3-1. Hash Several Files at Once

$ sha256sum *.iso
e3b0c4...b855  ubuntu.iso
9f86d0...0a08  debian.iso

3-2. Save the List to a File

$ sha256sum *.iso > SHA256SUMS

The name SHA256SUMS is a common convention on distribution sites. The file is just plain text with "hash + filename" on each line.

Lina: What do I use this list for later?
Linny-senpai: With the next step's -c, you can verify in bulk whether "the saved fingerprints" still match the current files. It's handy for periodic checks that a backup hasn't rotted.
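If you want to see what such a list looks like without downloading real images, the following sketch builds one from two tiny placeholder files. The names a.iso and b.iso are stand-ins, not real disk images.

```shell
#!/bin/sh
# Sketch: build a checksum list for several files and inspect it.
# a.iso and b.iso are tiny placeholder files, not real disk images.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'alpha\n' > a.iso
printf 'beta\n'  > b.iso

sha256sum *.iso > SHA256SUMS

cat SHA256SUMS                       # one "hash  filename" line per file
lines=$(grep -c '' SHA256SUMS)       # count the lines in the list
echo "entries recorded: $lines"
```

Keeping the list next to the files (or, for backups, somewhere separate) is what lets you re-check them later.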

4. Verifying with -c (the Main Event)

Conclusion: sha256sum -c listfile compares recorded hashes against the current files: OK if they match, FAILED if not.

4-1. Check Against a List

$ sha256sum -c SHA256SUMS
ubuntu.iso: OK
debian.iso: OK

If a file's contents changed, you get:

ubuntu.iso: OK
debian.iso: FAILED
sha256sum: WARNING: 1 computed checksum did NOT match
Lina: If I see FAILED, does that mean the file is broken?
Linny-senpai: It means "the saved fingerprint and the current contents differ." The cause is either a corrupted download or someone swapping the file. Either way, the right move is to not use that file.
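The OK / FAILED behavior can be reproduced safely with throwaway files. This is a minimal sketch, assuming GNU sha256sum; the file names and contents are placeholders, and appending one byte stands in for real corruption.

```shell
#!/bin/sh
# Sketch: record hashes, damage one file, and watch -c report it.
# File names and contents are placeholders for this demo.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'alpha\n' > a.iso
printf 'beta\n'  > b.iso
sha256sum a.iso b.iso > SHA256SUMS

before=0
sha256sum -c SHA256SUMS || before=$?     # stays 0: every file matched

printf 'x' >> b.iso                      # simulate corruption: append one byte

after=0
sha256sum -c SHA256SUMS || after=$?      # non-zero: at least one file failed

echo "exit code before: $before, after: $after"
```

The second run prints a.iso: OK and b.iso: FAILED, and the exit code switches from 0 to non-zero.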

4-2. Verify Against a Single Official Value

If a download page lists only one line like "SHA256: official value", turn that one line into a list file and verify.

# Write the official value and filename on one line (two spaces)
$ echo "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  ubuntu.iso" > check.txt
$ sha256sum -c check.txt
ubuntu.iso: OK

Always two spaces

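In a list file, the gap between the "hash" and the "filename" should be two characters: the separator space plus the mode character, which is normally another space. With only one space, the line may be rejected as malformed, depending on the sha256sum version and on whether the list mixes formats. Using sha256sum's own output avoids the mistake.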
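If you do this often, the one-line check can be wrapped in a small function. The sketch below is one possible shape; verify_one is a hypothetical helper name, not something shipped with coreutils, and the demo file is created on the spot.

```shell
#!/bin/sh
# verify_one is a hypothetical helper, not part of coreutils.
# Usage: verify_one <expected-sha256> <file>
verify_one() {
    # Build one "hash  filename" line (two spaces) and check it from stdin.
    printf '%s  %s\n' "$1" "$2" | sha256sum -c -
}

tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'hello\n' > a.txt

good=5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03
bad=$(printf '%064d' 0)              # 64 zeros: a deliberately wrong value

good_rc=0; verify_one "$good" a.txt || good_rc=$?
bad_rc=0;  verify_one "$bad"  a.txt || bad_rc=$?

echo "good value -> exit $good_rc, wrong value -> exit $bad_rc"
```

Letting printf insert the two spaces removes the most common typing mistake, and no temporary list file is left behind.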

4-3. Useful Options for Verifying

# Hide OK lines, show only failures
$ sha256sum -c --quiet SHA256SUMS

# Skip files that are listed but missing on disk (when you downloaded only some)
$ sha256sum -c --ignore-missing SHA256SUMS

# Print nothing, judge by exit code (for scripts)
$ sha256sum -c --status SHA256SUMS && echo "all match"

--status prints nothing and decides solely by exit code (0 success / 1 failure). It's ideal for automated checks in shell scripts.
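In a script, these options usually end up inside an if statement. The following sketch shows one way to do that; check_list is a hypothetical wrapper name and the files are placeholders created for the demo.

```shell
#!/bin/sh
# Sketch: --quiet and --status on a matching list and on a damaged file.
# check_list is a hypothetical wrapper; file names are placeholders.
check_list() {
    if sha256sum -c --status "$1"; then
        echo "all match"
    else
        echo "verification failed for $1"
        return 1
    fi
}

tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'alpha\n' > a.iso
printf 'beta\n'  > b.iso
sha256sum a.iso b.iso > SHA256SUMS

quiet_ok=$(sha256sum -c --quiet SHA256SUMS)      # empty: nothing failed
status_ok=$(check_list SHA256SUMS)               # "all match"

printf 'x' >> b.iso                              # damage one file
quiet_bad=$(sha256sum -c --quiet SHA256SUMS 2>/dev/null) || true
status_bad=$(check_list SHA256SUMS) || true

echo "quiet (ok)  : [$quiet_ok]"
echo "status (ok) : $status_ok"
echo "quiet (bad) : [$quiet_bad]"
echo "status (bad): $status_bad"
```

With --quiet you still get a human-readable line for each failure, while --status leaves all reporting to your own script.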

5. Corruption Detection Is Not Tamper Detection

Conclusion: Any hash can catch accidental corruption, but defending against malicious tampering requires a collision-resistant hash such as SHA-256 or stronger.

Lina: Aren't corruption detection and tamper detection the same thing?
Linny-senpai: Similar, but different. Corruption is a file "breaking on its own" from a transmission error or a bad disk. Tampering is an attacker deliberately swapping the contents while keeping the fingerprint the same. Resisting the latter requires a hash whose fingerprints can't be deliberately collided.

Two goals

Goal             | What it prevents                   | Usable hashes
Corruption check | Accidental transit/disk corruption | MD5 / SHA-1 are fine
Tamper check     | Deliberate swaps by an attacker    | SHA-256 or stronger

6. Why MD5 Is Discouraged Today

Conclusion: MD5 and SHA-1 allow practical collision attacks ("different contents, same fingerprint"), so they're discouraged for security.

Lina: I see MD5 a lot — is it forbidden to use?
Linny-senpai: Not "forbidden" — it depends on the use case. MD5 and SHA-1 have practical collision attacks, meaning an attacker can deliberately craft two files with different contents but the same fingerprint.
Lina: So even a matching fingerprint isn't reassuring...
Linny-senpai: Right. So don't use MD5 / SHA-1 for tamper protection. But for merely catching accidental corruption in transit, they're still fine. When in doubt, choose SHA-256 and you won't go wrong.

7. Common Beginner Pitfalls

Conclusion: Misreading upper/lowercase, the number of spaces, and stray line endings are the typical causes of failed verification.

7-1. Comparing by Eye and Missing It

Comparing 64 digits by eye invites mistakes. Always verify mechanically with -c.

# Bad: compare by eye (you'll miss something)
# Good: make a list and verify with -c
$ sha256sum -c SHA256SUMS

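7-2. A Format Warning Caused by Spaces

If the separator in the list file is a single space, sha256sum may treat the line as malformed (behavior varies by version and by whether the list mixes formats). A malformed line is skipped and reported with a warning rather than FAILED, for example:

sha256sum: WARNING: 1 line is improperly formatted

Using a list made with sha256sum file > SHA256SUMS keeps the spacing correctly at two.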

7-3. A Windows-Made List Won't Verify

Files from Windows use \r\n (CRLF) line endings, and the trailing \r can cause trouble.

# Convert CRLF to LF before verifying
$ tr -d '\r' < SHA256SUMS.txt | sha256sum -c -
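You can rehearse this situation without a Windows machine. The sketch below writes a list with a CRLF line ending on purpose and then verifies it through tr; the file names are placeholders, and how sha256sum reacts to the unconverted file depends on its version, so only the converted path is shown.

```shell
#!/bin/sh
# Sketch: create a CRLF-terminated list, then strip \r before verifying.
# a.txt and SHA256SUMS.txt are placeholder names for this demo.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'hello\n' > a.txt

# Write the list with a CRLF line ending, as a Windows tool might.
printf '%s\r\n' "$(sha256sum a.txt)" > SHA256SUMS.txt

cr_count=$(tr -cd '\r' < SHA256SUMS.txt | wc -c)
echo "carriage returns in list: $cr_count"

# Strip the \r bytes on the fly and verify from stdin.
fixed_rc=0
tr -d '\r' < SHA256SUMS.txt | sha256sum -c - || fixed_rc=$?
echo "exit code after stripping CR: $fixed_rc"
```

Counting the \r bytes first is a quick way to confirm that line endings are the culprit before you start doubting the download.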

7-4. Hashes Match but the Files "Look Different"

If only the filename differs but the contents are identical, the hash matches. Remember a hash looks only at contents — not the filename or timestamp.
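A quick experiment confirms this. In the sketch below, a file is copied under a new name and given a different timestamp; the file names are arbitrary, and hashing from stdin keeps the name out of the output so only the fingerprints are compared.

```shell
#!/bin/sh
# Sketch: same bytes under a different name and timestamp give the same hash.
# original.txt and renamed.txt are placeholder names for this demo.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'hello\n' > original.txt
cp original.txt renamed.txt
touch -t 200001010000 renamed.txt      # give the copy a different timestamp

h1=$(sha256sum < original.txt | cut -d ' ' -f 1)
h2=$(sha256sum < renamed.txt  | cut -d ' ' -f 1)

if [ "$h1" = "$h2" ]; then
    echo "same contents, same hash"
fi
```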

8. Mini Exercises: Learn by Doing

Conclusion: Create your own file and walk through computing, saving, verifying, and detecting corruption to make it stick.

Lina: I've got the knowledge — I want to try it for real!
Linny-senpai: The best way is to make a test file and experiment. Here are three exercises.

Exercise 1: Create a file with echo hello > a.txt and print its SHA-256 hash.

Hint

Just pass the filename to the SHA-256 command.

Solution
$ echo hello > a.txt
$ sha256sum a.txt
5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03  a.txt

Exercise 2: Save a.txt's hash list to SUMS and verify it with -c to see OK.

Hint

Save with > SUMS, then verify with -c SUMS.

Solution
$ sha256sum a.txt > SUMS
$ sha256sum -c SUMS
a.txt: OK

Exercise 3: Modify a.txt, then verify again with -c and confirm you get FAILED.

Hint

Append something to the file, then run -c SUMS again.

Solution
$ echo world >> a.txt
$ sha256sum -c SUMS
a.txt: FAILED
sha256sum: WARNING: 1 computed checksum did NOT match

When the contents change, the fingerprint changes and verification turns to FAILED.

9. Copy-Paste Templates

Conclusion: The standard forms for computing, saving a list, verifying, and matching a single official value are ready to copy.

Checksum Templates

# Fingerprint a single file
sha256sum file

# Hash several files at once
sha256sum *.iso

# Save a list
sha256sum *.iso > SHA256SUMS

# Verify against a saved list
sha256sum -c SHA256SUMS

# Show only failures
sha256sum -c --quiet SHA256SUMS

# For scripts (judge by exit code)
sha256sum -c --status SHA256SUMS && echo OK

# Verify against a single official value
echo "<official hash>  file.iso" | sha256sum -c -

# When you want a stronger hash
sha512sum file

What not to do

  • Use MD5 / SHA-1 for tamper protection
  • Compare 64 digits by eye (verify mechanically with -c)
  • Use a single separator space (always two)
