tar in Practice: Incremental Backups and Exclusions

What Is a tar Incremental Backup?

Conclusion: It archives only what changed since the last run. --listed-incremental records a snapshot and computes the diff from it.

Once you can create and extract archives (covered in tar Basics), three practical skills come next:

  • Incremental backups: capture only what changed since the previous run, not a full copy every time
  • Exclusions: drop node_modules, caches, and other noise
  • Destination and path control: fix exactly where and at what depth files extract

Assumptions for this guide

  • OS: Ubuntu (GNU tar)
  • tar --version reports GNU tar (--listed-incremental in particular is a GNU extension)
  • BSD tar (the default on macOS) supports --exclude but has no --listed-incremental, and some other options differ
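
A quick way to confirm the assumption before relying on GNU-only options. This is a minimal sketch; the variable name tar_flavor is made up for illustration.

```shell
# Check which tar implementation is first on PATH.
# GNU tar prints a first line such as "tar (GNU tar) <version>".
if tar --version 2>/dev/null | head -n 1 | grep -q 'GNU tar'; then
    tar_flavor=gnu
else
    # For example bsdtar on macOS; GNU tar is often installed there as gtar.
    tar_flavor=other
fi
echo "tar flavor: $tar_flavor"
```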

Why Use --listed-incremental?

Conclusion: --listed-incremental tracks deletions and moves. --newer (timestamp comparison) misses deletions and gets restores wrong.

There are two ways to do incremental backups:

Method                     Basis                             Tracks deletions  Recommended
--listed-incremental (-g)  snapshot file                     yes               best
--newer / --after-date     file timestamps (mtime or ctime)  no                limited

--newer only picks up "files newer than a given time," so it cannot record files deleted since the last run. On restore, files you removed reappear. --listed-incremental records directory state in a snapshot file (conventionally with a .snar extension), so it reproduces the diff correctly, including deletions.

Timestamp-based --newer (or --newer-mtime, which looks at mtime only) is fine for a quick diff, but for generation management and accurate restores, use --listed-incremental.

How to Create an Incremental Backup

Conclusion: Keep pointing -g at the same snapshot file. The first run is automatically level 0 (full); later runs become level 1 and up (incremental).

Level 0 (full backup)

On the first run, when the snapshot file does not exist, you automatically get a full backup.

$ tar -czg /backup/home.snar -f /backup/home-full.tar.gz /home/user/
  • -g /backup/home.snar: snapshot file (short form of --listed-incremental)
  • -c: create / -z: gzip / -f: output file
  • After this run, home.snar holds the recorded directory state

Level 1 and up (incremental)

Run again with the same snapshot file, and only changes since the previous run are archived.

$ tar -czg /backup/home.snar -f /backup/home-inc1.tar.gz /home/user/

home.snar is updated on every run and drives the next diff. Use a different archive name (home-inc1, home-inc2...) per generation.
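
To see the level 0 / level 1 split without touching real data, the two runs can be tried on a throwaway tree. A minimal sketch, assuming GNU tar; every path lives under a mktemp directory, and the file names (a.txt, demo.snar, and so on) are made up for the demo.

```shell
work=$(mktemp -d)
mkdir -p "$work/src" "$work/backup"
echo one > "$work/src/a.txt"
echo two > "$work/src/b.txt"

# Level 0: demo.snar does not exist yet, so everything is archived.
tar -czg "$work/backup/demo.snar" -f "$work/backup/full.tar.gz" -C "$work" src

# Wait so the new file's timestamp is clearly later than the level-0 run.
sleep 1
echo three > "$work/src/c.txt"

# Level 1: same snapshot file, new archive name.
tar -czg "$work/backup/demo.snar" -f "$work/backup/inc1.tar.gz" -C "$work" src

full_list=$(tar -tzf "$work/backup/full.tar.gz")
inc_list=$(tar -tzf "$work/backup/inc1.tar.gz")

# The full archive should list all three entries under src/;
# the incremental should list only src/ and src/c.txt.
printf 'full:\n%s\n\ninc1:\n%s\n' "$full_list" "$inc_list"
```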

When you want a full backup every time

Point -g at /dev/null. Since the snapshot is never saved (writes to /dev/null vanish), every run is a level-0 full backup.

$ tar -czg /dev/null -f /backup/full.tar.gz /home/user/

Store the snapshot file separately from the backup data. Lose the .snar and the diff chain breaks: the next run finds no snapshot and silently becomes a full backup. Archives you already have still restore, because extraction does not read the snapshot.

How Do You Restore from Incremental Backups?

Conclusion: Extract in generation order, starting from level 0. Add --listed-incremental on extraction so deletions are replayed too.

Restore in the exact order the archives were created.

$ mkdir -p /restore
$ tar -xzg /dev/null -f /backup/home-full.tar.gz -C /restore
$ tar -xzg /dev/null -f /backup/home-inc1.tar.gz -C /restore
$ tar -xzg /dev/null -f /backup/home-inc2.tar.gz -C /restore
  • Pass /dev/null to -g on extraction (the snapshot contents are not read during restore, but the flag enables --listed-incremental mode)
  • In this mode, files deleted in an increment are also removed from the destination (a faithful generation restore)
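
The deletion replay is easiest to believe once you have watched it happen. A minimal sketch on a scratch tree, assuming GNU tar; the file names (keep.txt, gone.txt, new.txt) are invented for the demo.

```shell
work=$(mktemp -d)
mkdir -p "$work/src" "$work/backup" "$work/restore"
echo keep > "$work/src/keep.txt"
echo gone > "$work/src/gone.txt"

# Level 0
tar -czg "$work/backup/demo.snar" -f "$work/backup/full.tar.gz" -C "$work" src

# Change the tree: delete one file, add another.
sleep 1
rm "$work/src/gone.txt"
echo new > "$work/src/new.txt"

# Level 1
tar -czg "$work/backup/demo.snar" -f "$work/backup/inc1.tar.gz" -C "$work" src

# Restore in creation order; -g /dev/null turns on incremental extraction.
tar -xzg /dev/null -f "$work/backup/full.tar.gz" -C "$work/restore"
tar -xzg /dev/null -f "$work/backup/inc1.tar.gz" -C "$work/restore"

# gone.txt was restored by the full archive and should be removed again
# when the increment is applied.
ls "$work/restore/src"
```

Dropping -g /dev/null from the second extraction would leave gone.txt in place, which is exactly the mismatch the option exists to prevent.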

Inspecting contents first

List an archive's contents before extracting.

$ tar -tzf /backup/home-inc1.tar.gz | head

An incremental archive includes directory entries alongside the changed files.

How to Specify Exclusion Patterns

Conclusion: Use --exclude=PATTERN for individual excludes and --exclude-from=FILE for a list. Place the options before the source paths.

Exclude individually

$ tar --exclude='*.log' --exclude='node_modules' \
      -czf project.tar.gz project/
  • Patterns are globs (* ? [...]) matched against member names in the archive
  • Put --exclude before the source path (project/); placed after, it may not take effect

Keep exclusions in a file

When there are many, use one pattern per line.

$ cat > exclude.txt <<'EOF'
*.log
*.tmp
node_modules
.cache
EOF

$ tar --exclude-from=exclude.txt -czf project.tar.gz project/

The short form of --exclude-from is -X.
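
An exclude list is worth testing on a small tree before trusting it with a real backup. A minimal sketch, assuming GNU tar's default (unanchored) pattern matching; the project layout below is invented for the demo.

```shell
work=$(mktemp -d)
mkdir -p "$work/project/src" "$work/project/node_modules/pkg" "$work/project/.cache"
echo 'code'  > "$work/project/src/main.js"
echo 'noise' > "$work/project/debug.log"
echo 'dep'   > "$work/project/node_modules/pkg/index.js"
echo 'tmp'   > "$work/project/.cache/blob"

cat > "$work/exclude.txt" <<'EOF'
*.log
*.tmp
node_modules
.cache
EOF

# -X is the short form of --exclude-from; it comes before the source path.
tar -X "$work/exclude.txt" -czf "$work/project.tar.gz" -C "$work" project

# Listing the archive shows what survived the exclusions.
members=$(tar -tzf "$work/project.tar.gz")
echo "$members"
```

Listing the archive right after creating it is a cheap check that a pattern did not match more (or less) than intended.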

Handy built-in options

Option            Excludes
--exclude-vcs     .git / .svn / CVS and other VCS metadata
--exclude-caches  contents of directories that contain a CACHEDIR.TAG file (the tag file itself is kept)

$ tar --exclude-vcs --exclude='*.log' -czf src.tar.gz src/
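
Both built-ins can be tried on a scratch tree. A minimal sketch, assuming GNU tar; the directory names are invented, and the signature line is the one defined by the Cache Directory Tagging convention, which tar checks before treating a directory as a cache.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/.git" "$work/src/build-cache" "$work/src/lib"
echo 'ref: refs/heads/main' > "$work/src/.git/HEAD"
echo 'int x;' > "$work/src/lib/x.c"

# A cache directory is marked by a CACHEDIR.TAG file that begins with this signature.
echo 'Signature: 8a477f597d28d172789f06886806bc55' > "$work/src/build-cache/CACHEDIR.TAG"
echo 'object' > "$work/src/build-cache/x.o"

# tar may print a notice on stderr about the skipped cache directory.
tar --exclude-vcs --exclude-caches -czf "$work/src.tar.gz" -C "$work" src

# .git and the cache contents should be missing from the listing.
members=$(tar -tzf "$work/src.tar.gz")
echo "$members"
```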

How Do You Control the Destination and Path Depth?

Conclusion: -C fixes the destination directory, --strip-components=N removes leading path components, and --one-top-level confines everything under one parent.

Fix the destination (-C)

$ tar -xzf project.tar.gz -C /opt/app

This is the basic guard against files scattering into the current directory. tar does not create the -C directory for you, so create it first.

Strip leading directories (--strip-components)

When everything in an archive sits under a single top-level directory, like project-1.0/src/..., peel that level off on extraction.

$ tar -xzf project-1.0.tar.gz --strip-components=1 -C /opt/app

This drops project-1.0/ and extracts straight into /opt/app/src/.... A staple for placing a GitHub source tarball into a fixed location.
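
The effect is easy to check on a small archive built for the purpose. A minimal sketch; the project-1.0 tree and the opt/app destination under a temp directory are stand-ins for the real paths.

```shell
work=$(mktemp -d)
mkdir -p "$work/project-1.0/src" "$work/opt/app"
echo 'hello'  > "$work/project-1.0/src/main.c"
echo 'readme' > "$work/project-1.0/README"
tar -czf "$work/project-1.0.tar.gz" -C "$work" project-1.0

# Drop the leading project-1.0/ component while extracting.
tar -xzf "$work/project-1.0.tar.gz" --strip-components=1 -C "$work/opt/app"

# Files now sit directly under opt/app (src/main.c, README), with no project-1.0/ level.
(cd "$work/opt/app" && find . -type f)
```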

Confine to one parent directory (--one-top-level)

For an archive whose files scatter at the top level, force everything under a single directory.

$ tar -xzf messy.tar.gz --one-top-level=extracted

extracted/ is created and everything extracts inside it (no mess).
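
To try this safely, build a deliberately messy archive first. A minimal sketch, assuming GNU tar 1.28 or later (where --one-top-level was introduced); the file names are invented.

```shell
work=$(mktemp -d)
mkdir -p "$work/loose" "$work/out"
echo a > "$work/loose/a.txt"
echo b > "$work/loose/b.txt"

# Archive the files with no parent directory, so they sit at the archive's top level.
tar -czf "$work/messy.tar.gz" -C "$work/loose" a.txt b.txt

# Extract from inside the output directory; everything lands under extracted/.
(cd "$work/out" && tar -xzf "$work/messy.tar.gz" --one-top-level=extracted)

ls "$work/out" "$work/out/extracted"
```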

In Practice: A Daily Backup Pattern

Conclusion: Run a weekly level 0 and daily incrementals on top of it. Pair an exclude list with a fixed destination to cut down on accidents.

Copy-paste template

# Weekly full: remove the old .snar first, otherwise this run is just another incremental
rm -f /backup/home.snar
tar --exclude-from=/etc/backup/exclude.txt \
    -czg /backup/home.snar \
    -f /backup/home-$(date +%Y%m%d)-full.tar.gz \
    /home/user/

# Daily incremental (keep reusing the same .snar)
tar --exclude-from=/etc/backup/exclude.txt \
    -czg /backup/home.snar \
    -f /backup/home-$(date +%Y%m%d)-inc.tar.gz \
    /home/user/

# Restore (full first, then every incremental in creation order; deletions replayed)
tar -xzg /dev/null -f /backup/home-...-full.tar.gz -C /restore
tar -xzg /dev/null -f /backup/home-...-inc.tar.gz  -C /restore
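
With dated archive names, the restore order can be left to the shell's sorted glob expansion. A minimal sketch: restore_chain is a hypothetical helper, not part of tar, and it assumes the home-YYYYMMDD-full / home-YYYYMMDD-inc naming from the template, where names sort by date and "full" sorts before "inc" on the same day.

```shell
# Hypothetical helper: apply every archive in a backup directory in name order.
# usage: restore_chain BACKUP_DIR DEST_DIR
restore_chain() {
    backup_dir=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for archive in "$backup_dir"/home-*.tar.gz; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$archive" ] || continue
        echo "restoring $archive"
        # -g /dev/null enables incremental extraction so deletions are replayed.
        tar -xzg /dev/null -f "$archive" -C "$dest" || return 1
    done
}
```

Called as restore_chain /backup /restore, it walks the whole directory. To restore to a specific day, point it at a directory that holds only the archives up to that day, starting from a full.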

What not to do

  • Extracting an increment before the full backup (full always comes first)
  • Keeping the .snar snapshot only next to the archives (single point of loss)
  • Extracting straight into a production directory without fixing -C

For scheduling, see cron Basics; for capacity planning, see No space left on device.
