Diagnosing Swap Thrashing on Linux: When Memory Pressure Makes It Slow

What Is Swap Thrashing?

Conclusion: Physical memory is exhausted, so the kernel constantly evicts pages to swap (disk) and reads them back. The CPU is nearly idle while disk I/O saturates, and the whole system becomes extremely slow. The root cause is "not enough memory," but the symptom looks like "the disk is slow" or "only load average is high."

When memory runs short, Linux moves rarely-used pages out to the swap area (swap-out) to free RAM. That alone is normal. The problem starts when an evicted page is needed again almost immediately: the kernel reads it back (swap-in), evicts another page to make room, then needs that one again, and so on. This loop is thrashing, and the CPU spends its time waiting on page transfers instead of doing real work.

$ uptime
 14:32:10 up 8 days,  3:11,  2 users,  load average: 18.40, 16.92, 12.05

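Load average spikes, yet top shows low CPU usage (us/sy) and a high wa (I/O wait). Linux counts tasks blocked in uninterruptible I/O wait toward the load average, so processes stalled on swap inflate it without using any CPU. "The CPU is idle but everything is slow" is the classic signature of thrashing.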

Assumptions (target environment)

  • OS: Ubuntu / general Linux
  • Symptom: the server suddenly becomes slow, unresponsive, even SSH lags
  • Swap is enabled (the Swap total in free is not 0)
  • You can read vmstat / free / /proc (permanent settings require sudo)

Why Watch the Rate, Not the Swap Usage?

Conclusion: Swap being used is not itself a problem. Pages from idle processes sitting quietly in swap are healthy. The danger is when swap-in / swap-out traffic flows continuously and fast; only that correlates with the slowness you feel. Judge by the rate (vmstat si/so), not by usage (free).

Even if free -h shows Swap full, that may just be "evicted long ago and left there," which is harmless. Conversely, moderate swap usage with heavy per-second traffic will make the system feel frozen.

Aspect             Healthy               Thrashing
Swap usage (free)  Stable, even if high  Swings up and down rapidly
si/so (vmstat)     Near 0                Continuously large
CPU                Normal                High wa (I/O wait)
Feel               Fine                  Every action is slow

The primary indicator of thrashing is not "how much swap is filled" but "how much is being moved in and out per second right now." The first thing to check is the si / so columns of vmstat.

How Do You Observe It First? (vmstat / free)

Conclusion: Run vmstat 1. If si (swap-in KB/s) and so (swap-out KB/s) stay large, thrashing is confirmed. Cross-check the headroom with free -h and the high wa with top.

Start by streaming vmstat at a 1-second interval. The first row reports averages since boot; each row after it covers the last second, so you see the "flow" of si/so.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2 14 2087600  51200  10240 102400 4096 5120  4200 5300 3100 8900  3  4  8 85  0
 1 12 2091200  48300  10100 101800 3800 4900  3900 5100 2980 8500  2  3 10 85  0

si / so keep flowing at megabytes per second and wa (I/O wait) is over 80%. b (processes blocked) is also high. That is hard evidence of thrashing. Next, check the headroom.

$ free -h
               total        used        free      shared  buff/cache   available
Mem:            3.8Gi       3.5Gi       120Mi        12Mi       180Mi        90Mi
Swap:           2.0Gi       1.9Gi        80Mi

available is tiny and Swap is nearly full. Physical memory is exhausted and it can no longer offload to swap. Confirm with top.

$ top -bn1 | head -5
top - 14:32:40 up 8 days,  load average: 18.40, 16.92, 12.05
Tasks: 210 total,   1 running, 209 sleeping
%Cpu(s):  3.0 us,  4.0 sy,  0.0 ni,  8.0 id, 85.0 wa,  0.0 hi,  0.0 si
MiB Mem :   3891.0 total,    120.0 free,   3591.0 used,    180.0 buff/cache
MiB Swap:   2048.0 total,     80.0 free,   1968.0 used

wa dominates and id (idle) is tiny. The CPU is waiting on I/O completion, not computing.
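To see which tasks are actually stuck, you can list the processes in uninterruptible sleep (state D), which is what the b column in vmstat counts. The following is a minimal sketch; the function name blocked_tasks is hypothetical, and it expects ps output with the state, PID, and command name in that order.

```shell
# blocked_tasks: read "state pid comm" lines on stdin (as printed by
# `ps -e -o state= -o pid= -o comm=`) and list the tasks in state D.
# The function name is illustrative, not part of any package.
blocked_tasks() {
    awk '
        # State D means uninterruptible sleep, usually waiting on disk I/O.
        $1 ~ /^D/ { n++; print $2, $3 }
        END       { printf "blocked=%d\n", n }'
}
```

Usage: run ps -e -o state= -o pid= -o comm= | blocked_tasks a few times during the slowdown. During thrashing, the same few processes tend to reappear in the list.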

If sar (the sysstat package) is available, you can go back through its history to see "when it started": sar -B reports paging (pgpgin/s, pgpgout/s, majflt/s) and sar -W reports swapping (pswpin/s, pswpout/s). Combining real-time vmstat with historical sar makes it easier to pinpoint the onset time.
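For a capture longer than a couple of rows, a small filter can count how many samples show heavy swap traffic instead of judging by eye. This is a sketch under stated assumptions: the function name swap_rate_summary and the THRESHOLD_KB cutoff (default 1000 KB/s) are illustrative choices, not standard values. It locates si, so, and wa from the vmstat header row and skips the first sample, which reports averages since boot.

```shell
# swap_rate_summary: read `vmstat` output on stdin and summarise swap traffic.
# THRESHOLD_KB is a hypothetical knob (default 1000 KB/s), not a kernel constant.
swap_rate_summary() {
    awk -v limit="${THRESHOLD_KB:-1000}" '
        # Header row: remember which column holds si, so and wa.
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Numeric rows are samples; the first one is the since-boot average.
        ("si" in col) && $1 ~ /^[0-9]+$/ {
            if (!seen++) next
            n++
            if ($(col["si"]) + $(col["so"]) >= limit) busy++
            wa_sum += $(col["wa"])
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "samples=%d busy=%d avg_wa=%d\n", n, busy, wa_sum / n
        }'
}
```

Usage: vmstat 1 30 | swap_rate_summary. If busy is close to samples and avg_wa is high, the swap traffic is sustained rather than a brief burst.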

Which Process Is Eating Swap?

Conclusion: smem -s swap -r is the most readable way to see per-process swap usage. If smem is unavailable, sum up VmSwap from /proc/<pid>/status. The biggest swap consumer is usually the main cause of thrashing.

smem can show swap usage per process directly (sudo apt install smem).

$ sudo smem -s swap -r | head
  PID User     Command                         Swap      USS      PSS      RSS
 2314 www-data java -Xmx3g -jar app.jar      1245184   210432   215300   240128
 1190 mysql    /usr/sbin/mysqld               412300    98200   101400   130560
 1532 root     /usr/bin/dockerd               120400    40100    42300    61440

The process with the outsized Swap column is the culprit. If you cannot install smem (for example, because adding packages to the host is undesirable), read VmSwap from /proc directly.

# List VmSwap of all processes, largest first (KB)
$ for f in /proc/[0-9]*/status; do
    awk '/^Name:/{n=$2} /^VmSwap:/{print $2, n, FILENAME}' "$f"
  done 2>/dev/null | sort -rn | head
1245184 java /proc/2314/status
412300 mysqld /proc/1190/status
120400 dockerd /proc/1532/status

A misconfigured memory reservation (a JVM -Xmx close to or larger than physical RAM, an oversized DB buffer pool, etc.) is the typical cause. Lowering the limit to a value that fits physical memory is often the real fix.

High swap usage does not automatically mean "the culprit." To find "the process moving pages in and out right now," take smem several times during thrashing and watch which processes have a changing Swap value. A static one is just sleeping pages.
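The same comparison can be done without smem by taking two VmSwap snapshots and printing the difference. The sketch below is one way to do it; swap_snapshot and swap_delta are hypothetical helper names, and the optional directory argument exists only so the helper can be pointed at something other than /proc.

```shell
# swap_snapshot [PROC_ROOT]: print "pid swap_kb name" for every process.
# Kernel threads have no VmSwap line and are skipped automatically.
swap_snapshot() {
    root="${1:-/proc}"
    for f in "$root"/[0-9]*/status; do
        [ -r "$f" ] || continue
        awk '/^Name:/{n=$2} /^Pid:/{p=$2} /^VmSwap:/{print p, $2, n}' "$f" 2>/dev/null
    done
}

# swap_delta OLD NEW: print "delta_kb pid name" for processes whose
# VmSwap changed between the two snapshot files, largest increase first.
swap_delta() {
    awk 'NR == FNR { old[$1] = $2; next }
         ($1 in old) && $2 != old[$1] { print $2 - old[$1], $1, $3 }' "$1" "$2" |
        sort -rn
}
```

Usage: swap_snapshot > /tmp/a; sleep 10; swap_snapshot > /tmp/b; swap_delta /tmp/a /tmp/b | head. Processes missing from the output had the same VmSwap in both snapshots, which suggests their swapped pages are simply parked.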

How Do You Stop the Slowness Right Now? (First Aid)

Conclusion: Stopping a runaway or oversized process, or lowering its memory limit and restarting, is the fastest fix. To reset accumulated swap, use swapoff -a && swapon -a, but running it with no free memory invites the OOM killer, so it is dangerous. First aid should only move in the direction of "freeing memory."

The safest and most effective step is to stop or reconfigure the oversized process you found with smem.

# Restart the culprit through the proper path (after reviewing its config)
$ sudo systemctl restart myapp.service

To "reset" by pulling swapped-out pages back into RAM, run swapoff followed by swapon. But swapoff reloads all swap contents into memory at once, so if available physical memory is below the swap usage, swapoff fails or the OOM killer fires.

# Dangerous: invites OOM if free memory is less than swap usage
$ free -h          # Confirm available > Swap used first
$ sudo swapoff -a && sudo swapon -a
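The "available > Swap used" check can be scripted so it is not skipped under pressure. The following is a minimal sketch that reads MemAvailable, SwapTotal, and SwapFree from /proc/meminfo; the function name swap_reset_is_safe is hypothetical, and passing the check only means the pages should fit at this moment, not that the reset is risk-free.

```shell
# swap_reset_is_safe [MEMINFO]: succeed only if MemAvailable exceeds swap in use.
# Defaults to /proc/meminfo; the optional argument is for pointing at a copy.
swap_reset_is_safe() {
    awk '$1 == "MemAvailable:" { avail = $2 }
         $1 == "SwapTotal:"    { total = $2 }
         $1 == "SwapFree:"     { free  = $2 }
         END {
             used = total - free
             printf "available=%d kB swap_used=%d kB\n", avail, used
             # Exit status 0 (safe) only when everything in swap fits in RAM.
             exit (avail > used ? 0 : 1)
         }' "${1:-/proc/meminfo}"
}
```

Usage: swap_reset_is_safe && sudo swapoff -a && sudo swapon -a. With the values from the free -h output shown earlier (about 90 MiB available against 1.9 GiB of swap in use), the check fails and swapoff is never run.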

"Just reboot" makes the symptom disappear, but it will recur unless you fix the memory configuration or the leak. After first aid, always move on to identifying the root cause (oversized config / leak / plain insufficient RAM).

Should You Tune swappiness?

Conclusion: vm.swappiness is a tendency value (0–100, or 0–200 since Linux 5.8; default 60) for "how aggressively to use swap." Lowering it makes the kernel less eager to evict application pages, which can improve the feel of an interactive server. But it does not solve physical memory shortage itself. The real fix is more RAM or less usage.

Check the current value.

$ cat /proc/sys/vm/swappiness
60

Lower it temporarily and observe (reverts on reboot).

# Start around 10 rather than 0
$ sudo sysctl -w vm.swappiness=10

Once you confirm the effect, make it permanent.

$ echo 'vm.swappiness = 10' | sudo tee /etc/sysctl.d/99-swappiness.conf
$ sudo sysctl --system

swappiness    Tendency                        Suits
0–10          Avoid swap as much as possible  Interactive servers, low latency
60 (default)  Balanced                        General purpose
100           Swap aggressively               Batch, throughput-oriented

Setting swappiness to 0 does not fully disable swap (it still swaps under memory pressure). Lowering it too far also upsets the balance with the file cache and can create a different slowness. Avoid extreme values and tune while watching vmstat for si/so to settle.

How Do You Cap a Service's Swap With cgroups?

Conclusion: For a systemd-managed service, MemoryMax / MemoryHigh impose a memory limit and contain thrashing so one service cannot drag down the whole host. When the limit is exceeded, only that service is subject to reclaim / OOM, and the rest are protected.

Set a memory limit on a specific service (e.g. an oversized Java app).

$ sudo systemctl edit myapp.service

Write the limits in the [Service] section.

[Service]
MemoryHigh=2G
MemoryMax=2.5G

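MemoryHigh is a "soft limit": above it the service is throttled and its memory is reclaimed aggressively. MemoryMax is a "hard limit": if usage still cannot be kept under it, the OOM killer acts inside that service. Both settings need the unified cgroup v2 hierarchy. Apply after saving.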

$ sudo systemctl daemon-reload
$ sudo systemctl restart myapp.service
$ systemctl show -p MemoryMax myapp.service

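Now the service's memory usage is capped by the cgroup, so reclaim pressure and OOM kills land on that service instead of on every process on the host. The service can still push its own pages to swap; MemorySwapMax in the same [Service] section caps that as well.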

Setting MemoryHigh slightly below MemoryMax creates a buffer zone where the service "is throttled and slows down" before being suddenly killed at the hard limit. In production, specifying both is safer than MemoryMax alone.
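To check whether the limits are actually being hit, the cgroup v2 memory.events file counts how often the service crossed each boundary. The sketch below summarises it; the function name cgroup_mem_events is hypothetical, and the path /sys/fs/cgroup/system.slice/myapp.service/memory.events is an assumption based on the default layout for system services, so adjust it to your host.

```shell
# cgroup_mem_events FILE: summarise a cgroup v2 memory.events file.
# Each line of the file is "<event> <count>"; missing events print as 0.
cgroup_mem_events() {
    awk '{ v[$1] = $2 }
         END {
             printf "high=%d max=%d oom_kill=%d\n", v["high"], v["max"], v["oom_kill"]
         }' "$1"
}
```

Usage: cgroup_mem_events /sys/fs/cgroup/system.slice/myapp.service/memory.events. A growing high count with max and oom_kill at 0 means the service is being throttled in the buffer zone without being killed.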

What If the Real Cause Is Insufficient RAM? (Adding Swap / Capacity)

Conclusion: If there is no misconfiguration and no leak, and memory is simply always short, adding hardware (more RAM) is the proper path. Adding swap is only a "better than dying to OOM" buffer; even with more swap, thrashing stays slow. Adding swap buys time; adding RAM solves it.

Steps to add a swap file to avoid imminent OOM (when you cannot add RAM immediately, e.g. in the cloud).

# Create a 2 GiB swap file (if swapon rejects a fallocate-created file on your filesystem, create it with dd instead)
$ sudo fallocate -l 2G /swapfile
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile

Make it permanent by adding to /etc/fstab.

/swapfile  none  swap  sw  0  0
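After enabling the file, it is worth confirming that the kernel really sees it. swapon --show does this interactively; the sketch below reads /proc/swaps instead so the result can be used in a script. The function name swap_devices is hypothetical, and it assumes the usual column order (Filename, Type, Size, Used, Priority, with sizes in KiB).

```shell
# swap_devices [FILE]: list active swap areas with size and percent used.
# Defaults to /proc/swaps; the first line of that file is a header.
swap_devices() {
    awk 'NR > 1 {
             pct = ($3 ? 100 * $4 / $3 : 0)
             printf "%s %d MiB, %d%% used\n", $1, $3 / 1024, pct
         }' "${1:-/proc/swaps}"
}
```

Usage: run swap_devices after swapon /swapfile and again after a reboot. If /swapfile is missing the second time, the /etc/fstab entry was not picked up.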

Adding swap does not stop the paging while memory is short, so the system stays "slow": it is a stopgap to avoid OOM kills, not a cure for thrashing. If you thrash chronically, more RAM or reducing the workload (lowering per-process memory limits, reducing concurrency) is the correct approach.

Checklist When It Still Won't Improve

Conclusion: Thrashing means physical memory is insufficient for the current workload. Confirm si/so with vmstat → identify the culprit with smem → fix the oversized config / leak → add RAM if still short. Almost every case resolves at one of these steps. swappiness and cgroups treat symptoms; the root cause is the balance between demand and the amount of memory.

  • [ ] Confirmed si / so in vmstat 1 are continuously large (rate, not usage)?
  • [ ] Confirmed wa (I/O wait) is high while us/sy are low in top?
  • [ ] Checked available and Swap headroom with free -h?
  • [ ] Identified the culprit with smem -s swap -r or VmSwap in /proc/*/status?
  • [ ] Does that process's memory config (JVM -Xmx, DB buffers) fit physical RAM?
  • [ ] Ruled out a memory leak (not monotonically increasing over time)?
  • [ ] Confirmed available > Swap used before first aid (swapoff)?
  • [ ] Considered more RAM / less concurrency if chronic?

Next Reading