rsync in Practice: Backups and Mirroring
What You'll Learn
- How to use rsync differential sync for real backups and mirroring
- The correct use of --delete, --link-dest, and --exclude, and how to avoid disasters
- A safe workflow that prevents "I synced and it deleted everything" and "out of space" failures
Quick Summary
- One-way refresh -> rsync -av src/ dst/
- Full mirror (delete extras too) -> rsync -av --delete src/ dst/
- Keep generations (incremental) -> hard-link the previous run with --link-dest
- Always run --dry-run before anything destructive
Assumptions (target environment)
- OS: Ubuntu (rsync 3.x)
- Both local and remote (over SSH) scenarios
- For the basics, see scp / rsync Basics
What is rsync differential sync?
Conclusion: rsync compares files already on the destination and transfers only what changed. The fast second run is exactly why it fits backups and mirroring.
rsync is not just a copy tool - it is a sync tool that transfers only the difference between source and destination. The first run is a full copy, but every later run sends only changed files (and only changed blocks), so even large datasets refresh quickly.
That property is what makes it ideal for scheduled backups (refresh the same directory daily) and mirroring (keep two locations identical).
Why is differential transfer fast?
Conclusion: By default rsync skips files whose size and modification time both match. Only changed files - and within them, only changed blocks - are sent, so total transfer is small.
rsync's speed comes from two levels of difference detection.
- Per-file skip check: by default rsync compares each file's size and modification time (mtime); if both match, it treats the file as unchanged and skips it.
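- Block-level delta transfer: for files marked as changed, a rolling checksum finds matching blocks and sends only the changed portions (the rsync algorithm). This applies to network transfers; when both paths are local, rsync defaults to --whole-file and copies changed files in full.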
# First run: full copy
$ rsync -av src/ /backup/dst/
# Later runs: only the delta (near-instant if nothing changed)
$ rsync -av src/ /backup/dst/
When mtimes are unreliable (clock skew, just after a filesystem migration), use --checksum (below) to compare actual file contents.
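To make the per-file skip check concrete, here is a minimal sketch of the same idea in shell. It is not rsync's implementation: quick_check_same is a hypothetical helper, it assumes GNU stat (as on Ubuntu), and it compares mtimes at whole-second resolution.

```shell
#!/usr/bin/env bash
# Sketch of the "quick check" idea: same size AND same mtime => treat as unchanged.
# quick_check_same is a made-up name; this only imitates the default skip decision.
quick_check_same() {
    local a=$1 b=$2
    # Both paths must be regular files to be comparable.
    [ -f "$a" ] && [ -f "$b" ] || return 1
    # GNU stat: %s = size in bytes, %Y = mtime in seconds since the epoch.
    [ "$(stat -c '%s %Y' "$a")" = "$(stat -c '%s %Y' "$b")" ]
}
```

A file whose content changed while its size and mtime stayed the same passes this check, which is exactly the case --checksum exists for.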
How do you write the basic backup form?
Conclusion: rsync -av src/ dst/ is the baseline. Archive mode (-a) copies recursively while preserving permissions, ownership, timestamps, and symlinks.
Real backups start with -a, which preserves attributes.
$ rsync -av /home/user/data/ /backup/data/
-a (archive mode) bundles these options.
| Included | Meaning |
|---|---|
| -r | Recursive (into subdirectories) |
| -l | Preserve symlinks as symlinks |
| -p | Preserve permissions |
| -t | Preserve timestamps |
| -g / -o | Preserve group / owner |
| -D | Preserve device and special files |
Common companions.
- -v: show what is transferred (-vv for more detail)
- -z: compress during transfer (good on slow links; can be slower on LANs)
- -h: human-readable sizes (pairs well with --progress)
The trailing-slash trap (most important)
rsync -av src/ dst/   # put the CONTENTS of src into dst
rsync -av src dst/    # put the src DIRECTORY itself into dst (dst/src/...)
A trailing / on the source changes the result. Most "my backup tree is off by one level" incidents come from this.
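The difference is easy to confirm in a scratch directory. The sketch below assumes rsync is installed; trailing_slash_demo and the directory names with_slash and without_slash are made up for illustration.

```shell
#!/usr/bin/env bash
# Copy the same source twice, once with and once without the trailing slash,
# then list where file.txt ended up. Everything happens in a temp directory.
trailing_slash_demo() {
    local t
    t=$(mktemp -d) || return 1
    mkdir "$t/src" "$t/with_slash" "$t/without_slash"
    touch "$t/src/file.txt"
    rsync -a "$t/src/" "$t/with_slash/"      # contents of src
    rsync -a "$t/src"  "$t/without_slash/"   # the src directory itself
    (cd "$t" && find with_slash without_slash -type f | LC_ALL=C sort)
    rm -rf "$t"
}
# Calling trailing_slash_demo prints:
#   with_slash/file.txt
#   without_slash/src/file.txt
```

The extra src/ level in the second path is the "off by one level" symptom described above.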
How do you mirror a directory? (--delete)
Conclusion: For a true mirror, add --delete to remove destination files missing from the source. It is destructive, so always run --dry-run first.
A plain -a backup only adds and updates. Files you delete from the source stay on the destination forever. To make source and destination identical (mirroring), use --delete.
# Always dry-run first to see WHAT gets deleted
$ rsync -av --delete --dry-run src/ /mirror/dst/
# If the output looks right, run for real
$ rsync -av --delete src/ /mirror/dst/
In dry-run output, files to be removed appear on deleting ... lines.
sending incremental file list
deleting old-report.csv
deleting cache/tmp.dat
./
new-report.csv
--delete destroys the destination
If the source path is wrong (for example an empty directory), --delete can wipe every file at the destination. These combinations are especially dangerous.
- Missing or extra trailing slash on the source
- A src/ whose src part is a variable that expanded to empty (the source silently becomes /)
Never run an rsync with --delete without a --dry-run first.
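One way to enforce this rule is a small wrapper that refuses suspicious sources and previews by default. This is only a sketch for local paths: check_mirror_args, mirror_with_guard, and the --apply argument are hypothetical names invented here, not rsync features, and the guard also rejects / as a source on purpose.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight guard for local mirrors; not part of rsync itself.

# check_mirror_args SRC DST: fail unless SRC is a non-empty directory and DST is set.
check_mirror_args() {
    local src=$1 dst=$2
    if [ -z "$src" ] || [ -z "$dst" ]; then
        echo "refusing: empty source or destination path" >&2
        return 1
    fi
    if [ ! -d "$src" ]; then
        echo "refusing: source is not a directory: $src" >&2
        return 1
    fi
    # An empty source combined with --delete would empty the destination.
    if [ -z "$(ls -A "$src")" ]; then
        echo "refusing: source directory is empty: $src" >&2
        return 1
    fi
}

# mirror_with_guard SRC DST [--apply]: dry-run by default, real run only with --apply.
mirror_with_guard() {
    # Strip one trailing slash so "$SRC/" with an empty SRC collapses to "" and is rejected.
    local src=${1%/} dst=${2%/} mode=${3:-}
    check_mirror_args "$src" "$dst" || return 1
    if [ "$mode" = "--apply" ]; then
        rsync -av --delete "$src/" "$dst/"
    else
        rsync -av --delete --dry-run "$src/" "$dst/"
    fi
}
```

Under these assumptions, mirror_with_guard "$SRC/" /mirror/dst prints the preview, and the same command with --apply appended performs the mirror once the preview looks right.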
You can control when deletion happens and what it covers.
- --delete-after: delete after the transfer finishes rather than during it (less likely to damage the destination on a mid-run failure)
- --delete-excluded: also delete destination files that match --exclude patterns (by default, excluded files on the destination are left alone)
How do you make generational (incremental) snapshots?
Conclusion: --link-dest hard-links unchanged files from a previous backup, so each generation looks like a full backup but only consumes disk for the differences.
A backup that overwrites the same directory every day cannot answer "restore the state from three days ago." To keep generations, use --link-dest.
--link-dest=DIR writes the destination by creating hard links to unchanged files in DIR (usually the previous backup). Identical files do not consume disk twice, so you can keep many generations efficiently.
# Make a dated snapshot directory each run
$ SRC=/home/user/data/
$ DEST=/backup/snapshots
$ TODAY=$(date +%F) # e.g. 2026-06-05
$ LATEST=$DEST/latest # symlink to the previous snapshot
$ rsync -av --delete \
--link-dest="$LATEST" \
"$SRC" "$DEST/$TODAY/"
# Point latest at the newest snapshot
$ ln -sfn "$DEST/$TODAY" "$LATEST"
Now /backup/snapshots/2026-06-05/ shows every file, but files unchanged since yesterday share hard links with yesterday's run, so the extra real disk used is only the delta.
- The --link-dest path is relative to the destination directory (or absolute). Watch this when using a relative path
- To delete a generation, just rm -rf its dated directory. Because of hard links, files still referenced by other generations are not actually removed
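The hard-link behavior can be seen without rsync at all. The sketch below reproduces by hand what --link-dest does for one unchanged file, using plain ln; hardlink_demo and the 2026-06-04 directory are hypothetical, and stat -c assumes GNU coreutils.

```shell
#!/usr/bin/env bash
# Two "generations" sharing one file through a hard link, then the older one is removed.
hardlink_demo() {
    local demo old new
    demo=$(mktemp -d) || return 1
    old=$demo/2026-06-04
    new=$demo/2026-06-05
    mkdir "$old" "$new"
    echo "unchanged report" > "$old/report.csv"
    # A second name for the same data: no extra disk space is used.
    ln "$old/report.csv" "$new/report.csv"
    stat -c 'links=%h' "$new/report.csv"    # links=2
    # Removing the older generation only drops one of the two names.
    rm -rf "$old"
    cat "$new/report.csv"                   # unchanged report
    stat -c 'links=%h' "$new/report.csv"    # links=1
    rm -rf "$demo"
}
```

The data is freed only when the last name pointing at it is removed, which is why deleting one dated directory never damages the others.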
How do you exclude unwanted files? (--exclude)
Conclusion: --exclude=PATTERN skips matching files; when there are many patterns, collect them in --exclude-from=FILE. Excluding caches, logs, and temp files makes backups lighter.
Including caches and temp files wastes both space and transfer time. Skip them with --exclude.
$ rsync -av --delete \
--exclude='*.tmp' \
--exclude='cache/' \
--exclude='node_modules/' \
src/ /backup/dst/
For many patterns, put them in a file.
# Contents of .rsync-exclude (one pattern per line)
# *.tmp
# cache/
# node_modules/
# .git/
$ rsync -av --delete --exclude-from='.rsync-exclude' src/ /backup/dst/
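A leading / anchors the pattern at the root of the transfer (the source directory), not at the filesystem root.
- --exclude='/cache': exclude only cache directly under the source
- --exclude='cache/': exclude directories named cache at any depth (the trailing / restricts the match to directories)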
Use --dry-run to confirm you are not excluding more than intended.
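A quick way to do that check is to list what a dry run would transfer under a given set of patterns. The helper below is a sketch: list_transfers is a hypothetical name, and it previews against an empty scratch directory, so it shows everything that survives the excludes rather than the delta against a real backup.

```shell
#!/usr/bin/env bash
# list_transfers SRC [rsync options...]: print the paths a dry run would send.
# --out-format='%n' prints one path per line; directories end in /.
list_transfers() {
    local src=${1%/} scratch
    shift
    scratch=$(mktemp -d) || return 1
    rsync -a --dry-run --out-format='%n' "$@" "$src/" "$scratch/"
    rmdir "$scratch"   # still empty, because nothing was actually copied
}
```

For example, list_transfers src --exclude='/cache' still lists sub/cache/ entries, while list_transfers src --exclude='cache/' lists none of them.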
How do you control bandwidth, resume, and progress?
Conclusion: -P (--partial --progress) gives resume and progress; --bwlimit caps bandwidth. Both are essential for large data, slow links, or transfers during business hours.
Sending large data to a remote host can saturate the link or fail partway and start over. Control it with these options.
# Progress + keep partial files on interruption (resume continues)
$ rsync -avP src/ user@server:/backup/dst/
# Cap bandwidth at 10 MB/s (good for backups during peak hours)
$ rsync -av --bwlimit=10M src/ user@server:/backup/dst/
# Non-standard SSH port
$ rsync -av -e "ssh -p 2222" src/ user@server:/backup/dst/
| Option | Effect |
|---|---|
| -P | --partial (keep partial files) + --progress |
| --partial | Keep partially transferred files and resume next time |
| --bwlimit=RATE | Upper transfer rate (e.g. 10M = 10 MB/s) |
| -e "ssh -p PORT" | Specify the remote shell (non-standard port, etc.) |
-z (compression) costs CPU. On a LAN or with already-compressed data (video, images, archives), dropping -z is often faster. It pays off on slow WAN links.
When should you use checksum comparison?
Conclusion: --checksum (-c) compares file contents instead of size and mtime. Use it when timestamps cannot be trusted (clock skew, post-migration verification). Avoid it for routine syncs - it is slow.
The default "size + mtime" check is fast but can miss changes when timestamps are unreliable. --checksum computes a checksum for every file and compares by content - reliable but slow.
# Strict content-based diff (migration / integrity verification)
$ rsync -avc src/ /backup/dst/
When to use it.
- Verify a copy is complete after a filesystem or server migration
- Guarantee identical content between source and destination (mtimes may differ)
--checksum reads every file on both sides, so it is very slow on large datasets. Use the default size + mtime check for routine sync and add --checksum only for verification.
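For the verification use case, a checksum dry run can serve as a read-only report. The function below is a sketch under stated assumptions: verify_copy is a hypothetical name, it only covers regular files and symlinks, and it checks that everything in the source exists with identical content at the destination (it does not look for extra files there).

```shell
#!/usr/bin/env bash
# verify_copy SRC DST: list files whose content differs or that are missing at DST.
# Returns 0 when nothing differs, 1 when differences were found, 2 if rsync failed.
verify_copy() {
    local src=${1%/} dst=${2%/} out differing
    # -r recurse, -l symlinks, -c compare by checksum, -n dry run (nothing is written).
    # -a is deliberately avoided so that a differing mtime alone is not reported.
    out=$(rsync -rlcn --out-format='%n' "$src/" "$dst/") || return 2
    # Drop directory entries (they end in /) and keep only file paths.
    differing=$(printf '%s\n' "$out" | grep -v '/$')
    if [ -n "$differing" ]; then
        printf '%s\n' "$differing"
        return 1
    fi
}
```

A call such as verify_copy src /backup/dst && echo "contents match" fits naturally at the end of a migration script.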
Checklist to prevent disasters (summary)
Conclusion: Make --dry-run mandatory for any rsync with --delete, and check trailing slashes and empty path variables every time. That alone prevents almost every serious rsync accident.
Pre-run checks (especially with --delete)
- [ ] Are source / destination paths correct (no empty-variable expansion)?
- [ ] Is the trailing slash intentional (src/ vs src differ)?
- [ ] Did you confirm deletions/transfers with --dry-run?
- [ ] Is there enough free space at the destination (df -h)?
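The free-space item can be scripted as well. This is a rough sketch: has_free_space is a hypothetical helper, it reads the Available column of df -Pk (1024-byte blocks), and the required size is whatever estimate the caller passes in.

```shell
#!/usr/bin/env bash
# has_free_space DST NEED_KB: succeed if the filesystem holding DST has at least
# NEED_KB kilobytes available.
has_free_space() {
    local dst=$1 need_kb=$2 avail_kb
    # -P forces one line per filesystem, -k reports 1024-byte blocks;
    # the 4th column of the 2nd line is the available space.
    avail_kb=$(df -Pk "$dst" | awk 'NR==2 {print $4}')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}
```

One possible use is has_free_space /backup "$(du -sk src/ | cut -f1)" || exit 1 before the rsync line. The du figure is an upper bound: a differential or --link-dest run usually needs far less.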
Copy-paste safe templates
# 1. One-way backup (add/update only)
rsync -av src/ /backup/dst/
# 2. Full mirror (dry-run first, always)
rsync -av --delete --dry-run src/ /mirror/dst/
rsync -av --delete src/ /mirror/dst/
# 3. Generational snapshot (hard-link the previous run)
rsync -av --delete --link-dest=/backup/snapshots/latest \
src/ /backup/snapshots/$(date +%F)/
# 4. To a remote host (progress + bandwidth cap + non-standard port)
rsync -avP --bwlimit=10M -e "ssh -p 2222" src/ user@server:/backup/dst/