Create Checksum File Recursively with xxhsum
Data integrity matters. Whether you're backing up files, distributing software, or just making sure nothing got corrupted during a transfer, checksums are your safety net. Enter xxhsum, a fast and reliable tool for creating and verifying file checksums on Linux. Here's how to use it effectively.
What Is xxhsum and Why Should You Care?
xxhsum is a command-line utility that generates hashes for files so you can verify later that nothing has changed. It uses the xxHash family of non-cryptographic hash algorithms, which are much faster than cryptographic hashes such as the one behind sha256sum and are well suited to detecting accidental corruption. If you're managing large directories, repositories, or backups, xxhsum helps ensure your files stay intact.
Creating Checksums Recursively (Basic)
If you just want to hash all files in a folder and its subfolders quickly:
find /path/to/folder -type f -exec xxhsum -H xxh3 {} \; > checksums.xxh3
This command:
- Recursively finds every file in the folder and all subfolders
- Hashes each one using the xxh3 algorithm (fast and reliable)
- Saves the results to a checksums.xxh3 file
You now have a record of what all your files looked like at this moment in time.
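The command above starts a separate xxhsum process for every file, because find's `-exec` is terminated with `\;`. As a small sketch of what that means, the two helper functions below count how many times find launches its command with `\;` versus the standard `+` terminator, which batches many paths into each invocation. The function names are hypothetical, and `echo run` stands in for xxhsum so the sketch runs without it installed.

```shell
# Hypothetical helpers: count how many times find runs its -exec command.
# 'echo run' is a stand-in for the real hashing command.

# One invocation per file (the form used in the basic command).
count_runs_per_file() {
    find "$1" -type f -exec sh -c 'echo run' _ {} \; | wc -l | tr -d ' '
}

# One invocation per batch of files ('+' appends as many paths as fit).
count_runs_batched() {
    find "$1" -type f -exec sh -c 'echo run' _ {} + | wc -l | tr -d ' '
}

# The batched equivalent of the basic command would then be:
#   find /path/to/folder -type f -exec xxhsum -H xxh3 {} + > checksums.xxh3
```

For a folder with a handful of files the difference is negligible. It starts to matter when there are many thousands of small files.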
The Sorted Approach (Better Consistency)
Want predictable, consistent output? Sorting filenames first ensures the checksum file is always in the same order, which is useful for version control or comparisons:
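find /path/to/folder -type f -print0 | sort -z | xargs -0 xxhsum -H xxh3 > checksums.xxh3
This takes slightly longer but produces output that's easier to read and compare across different checksum runs. The -print0, -z, and -0 options (available in GNU find, sort, and xargs) pass filenames null-delimited, so paths containing spaces or quotes survive the pipeline intact.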
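One practical benefit of a stable order is that two checksum files taken at different times can be compared line by line. Here is a minimal sketch, assuming both files were produced by the same sorted command. `compare_manifests` is a hypothetical helper name, and the file names in the usage comment are placeholders.

```shell
# Hypothetical helper: report whether two sorted checksum files differ.
compare_manifests() {
    # diff exits 0 when the files are identical and non-zero otherwise
    # (1 for differences, greater than 1 if a file could not be read).
    if diff -u "$1" "$2" > /dev/null; then
        echo "identical"
    else
        echo "different"
    fi
}

# Usage (placeholder file names):
#   compare_manifests checksums-old.xxh3 checksums-new.xxh3
```

Drop the redirect to /dev/null to see which lines changed.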
Create Checksum File (Excluding .git Folder)
If you're working with Git repositories or other folders with large ignored directories, you probably don't want to hash everything. The .git directory, in particular, can be enormous and unnecessary to hash.
Basic Command
find /path/to/folder -type f -not -path '*/.git/*' -exec xxhsum -H xxh3 {} \; > checksums.xxh3
This skips any files inside .git directories while hashing the rest.
Sorting Command
For sorted output with .git excluded:
find /path/to/folder -type f -not -path '*/.git/*' -print0 | sort -z | xargs -0 xxhsum -H xxh3 > checksums.xxh3
Performance Optimization: The -prune Method
Working with massive repositories? The -prune approach is faster because it tells find to skip the entire .git directory rather than finding files and filtering them afterward:
find /path/to/folder -name .git -prune -o -type f -print0 | xargs -0 xxhsum -H xxh3 > checksums.xxh3
The saving grows with the size of .git, because find never descends into it.
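To see that the two exclusion styles select the same files, the sketch below lists paths with each and sorts them so the results can be compared. Both function names are hypothetical. `-print` is used instead of `-print0` only to keep the output readable.

```shell
# Hypothetical helpers: list files outside .git in two ways.

# Filter: find still walks into .git, then discards what it found there.
list_filtered() {
    find "$1" -type f -not -path '*/.git/*' -print | LC_ALL=C sort
}

# Prune: find stops at any entry named .git and never descends into it.
list_pruned() {
    find "$1" -name .git -prune -o -type f -print | LC_ALL=C sort
}
```

The two differ in one edge case. A regular file named .git (as found inside Git submodules) is kept by the filter but dropped by the prune expression, since `-name .git` matches it too.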
Sorted Optimization
Want both speed and consistency? Combine -prune with sorting:
find /path/to/folder -name .git -prune -o -type f -print0 | xargs -0 xxhsum -H xxh3 | sort -k2 > checksums.xxh3
This prunes .git efficiently, then sorts the output by filename for clean, reproducible results.
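Sort order depends on the current locale, so the same list can come out in a different order on another machine. As a hedged sketch, forcing the C locale makes the comparison byte-wise and therefore stable across systems. `sort_manifest` is a hypothetical helper name.

```shell
# Hypothetical helper: sort checksum lines by filename (field 2 onward)
# using byte-wise ordering, independent of the user's locale settings.
sort_manifest() {
    LC_ALL=C sort -k2
}

# Usage, replacing the plain 'sort -k2' stage of the pipeline:
#   find /path/to/folder -name .git -prune -o -type f -print0 \
#     | xargs -0 xxhsum -H xxh3 | sort_manifest > checksums.xxh3
```

In the C locale uppercase letters sort before lowercase ones, which may look unusual but is the same everywhere.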
Verify Your Files
You've created checksums—now you want to verify that your files haven't been corrupted or modified. It's simple:
xxhsum -c checksums.xxh3
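This reads your checksum file, re-hashes each listed file, and compares the results. Files that match are reported as OK; files whose contents changed are reported as FAILED, so you can see exactly which ones differ. Paths are resolved as they were written into the checksum file, so if you generated it with a relative path, run the check from the same directory.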
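In a script, the exit status is usually more convenient than reading the output. This is a minimal sketch, assuming `xxhsum -c` follows the md5sum convention of exiting with a non-zero status when any file fails verification. `verify_manifest` is a hypothetical helper name.

```shell
# Hypothetical helper: verify a checksum file and summarize the result.
# Assumption: 'xxhsum -c' exits non-zero if any listed file fails.
verify_manifest() {
    if xxhsum -c "$1"; then
        echo "all files match"
    else
        echo "verification failed: see the lines above" >&2
        return 1
    fi
}

# Usage:
#   verify_manifest checksums.xxh3 || exit 1
```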
Choosing the Right Command for Your Situation
| Situation | Command | Why |
|---|---|---|
| Quick hash of a small folder | `find /path -type f -exec xxhsum -H xxh3 {} \;` | Simple; the per-file process overhead doesn't matter for a few files |
| Need reproducible checksums | `find /path -type f -print0 \| sort -z \| xargs -0 xxhsum -H xxh3` | Sorting ensures consistency across runs |
| Large Git repository | `find /path -name .git -prune -o -type f -print0 \| xargs -0 xxhsum -H xxh3` | -prune is more efficient than filtering |
| Git repo + need sorting | `find /path -name .git -prune -o -type f -print0 \| xargs -0 xxhsum -H xxh3 \| sort -k2` | Best of both worlds: speed and consistency |
A Real-World Example
Let's say you're archiving a project before major changes. You want to know exactly what it looked like today:
find ~/my-project -name .git -prune -o -type f -print0 | xargs -0 xxhsum -H xxh3 | sort -k2 > project-backup-checksums.xxh3
Later, if you suspect files were modified:
xxhsum -c project-backup-checksums.xxh3
In seconds, you'll know if anything changed.
Why xxhsum Over Other Tools?
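xxhsum is faster than sha256sum because xxHash is a non-cryptographic hash. It is well suited to detecting accidental corruption, but it offers no protection against deliberate tampering, so don't rely on it for security purposes such as password hashing or verifying downloads from untrusted sources.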
It's perfect for verifying file integrity during backups, downloads, or archival. If you're already using Linux's command line regularly, adding xxhsum to your toolkit takes minutes and pays dividends in peace of mind.