Diagnosing Swap Thrashing on Linux: When Memory Pressure Makes It Slow
What Is Swap Thrashing?
Conclusion: Physical memory is exhausted, so the kernel constantly evicts pages to swap (disk) and reads them back. The CPU is nearly idle while disk I/O saturates, and the whole system becomes extremely slow. The root cause is "not enough memory," but the symptom looks like "the disk is slow" or "only load average is high."
When memory runs short, Linux moves rarely-used pages out to the swap area (swap-out) to free RAM. That alone is normal. The problem starts when an evicted page is needed again almost immediately: read it back (swap-in), evict another page to make room, need that one again, and so on. This loop is thrashing, and the CPU spends its time waiting on page transfers instead of doing real work.
$ uptime
 14:32:10 up 8 days,  3:11,  2 users,  load average: 18.40, 16.92, 12.05
Load average spikes, yet top shows low CPU usage (us/sy) and a high wa (I/O wait). "The CPU is idle but everything is slow" is the classic signature of thrashing.
Assumptions (target environment)
- OS: Ubuntu / general Linux
- Symptom: the server suddenly becomes slow, unresponsive, even SSH lags
- Swap is enabled (the Swap row in free is not 0)
- You can read vmstat / free / /proc (permanent settings require sudo)
Why Watch the Rate, Not the Swap Usage?
Conclusion: Swap being used is not itself a problem. Pages from idle processes sitting quietly in swap are healthy. The danger is continuous, fast swap-in / swap-out traffic; only that correlates with the slowness you feel. Judge by the rate (vmstat si/so), not by usage (free).
Even if free -h shows Swap full, that may just be "evicted long ago and left there," which is harmless. Conversely, moderate swap usage with heavy per-second traffic will make the system feel frozen.
| Aspect | Healthy | Thrashing |
|---|---|---|
| Swap usage (free) | Stable, even if high | Swings up and down rapidly |
| si/so (vmstat) | Near 0 | Continuously large |
| CPU | Normal | High wa (I/O wait) |
| Feel | Fine | Every action is slow |
The primary indicator of thrashing is not "how much swap is filled" but "how much is being moved in and out per second right now." The first thing to check is the si / so columns of vmstat.
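The same rate can also be derived from the kernel's cumulative pswpin / pswpout counters in /proc/vmstat. The following is a minimal sketch, not a definitive tool: swap_rate is a hypothetical helper name, and the counters are in pages (commonly 4 KiB each), not KB.

```bash
# swap_rate BEFORE_FILE AFTER_FILE SECONDS   (hypothetical helper)
# Compares two saved copies of /proc/vmstat and prints pages swapped
# in/out per second over the interval between them.
swap_rate() {
    awk -v secs="$3" '
        /^pswp(in|out) / {
            if (FNR == NR) before[$1] = $2   # first file: older counters
            else           after[$1]  = $2   # second file: newer counters
        }
        END {
            printf "pswpin/s=%d pswpout/s=%d\n",
                (after["pswpin"]  - before["pswpin"])  / secs,
                (after["pswpout"] - before["pswpout"]) / secs
        }' "$1" "$2"
}

# Example use (file names are arbitrary):
#   cat /proc/vmstat > /tmp/vm.before; sleep 5; cat /proc/vmstat > /tmp/vm.after
#   swap_rate /tmp/vm.before /tmp/vm.after 5
```

Values that stay near zero suggest swap is merely occupied; values that stay large across several intervals point to thrashing.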
How Do You Observe It First? (vmstat / free)
Conclusion: Run vmstat 1. If si (swap-in KB/s) and so (swap-out KB/s) stay large, thrashing is confirmed. Cross-check the headroom with free -h and the high wa with top.
Start by streaming vmstat at a 1-second interval. The first row is the average since boot; every row after that covers the last second, so you see the "flow" of si/so.
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b    swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2 14 2087600  51200  10240 102400 4096 5120  4200  5300 3100 8900  3  4  8 85  0
 1 12 2091200  48300  10100 101800 3800 4900  3900  5100 2980 8500  2  3 10 85  0
si / so keep flowing at megabytes per second and wa (I/O wait) is over 80%. b (processes blocked) is also high. That is hard evidence of thrashing. Next, check the headroom.
$ free -h
               total        used        free      shared  buff/cache   available
Mem:           3.8Gi       3.5Gi       120Mi        12Mi       180Mi        90Mi
Swap:          2.0Gi       1.9Gi        80Mi
available is tiny and Swap is nearly full. Physical memory is exhausted and it can no longer offload to swap. Confirm with top.
$ top -bn1 | head -5
top - 14:32:40 up 8 days, load average: 18.40, 16.92, 12.05
Tasks: 210 total,   1 running, 209 sleeping
%Cpu(s):  3.0 us,  4.0 sy,  0.0 ni,  8.0 id, 85.0 wa,  0.0 hi,  0.0 si
MiB Mem :   3891.0 total,    120.0 free,   3591.0 used,    180.0 buff/cache
MiB Swap:   2048.0 total,     80.0 free,   1968.0 used
wa dominates and id (idle) is tiny. The CPU is waiting on I/O completion, not computing.
If sar (the sysstat package) is available, you can go back through history to see "when it started": sar -B reports paging (pgpgin/s, pgpgout/s) and sar -W reports swapping (pswpin/s, pswpout/s). Combining real-time vmstat with historical sar makes it easier to pinpoint the onset time.
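To avoid reading every vmstat row by eye, the stream can be filtered. This is a rough sketch under stated assumptions: flag_thrash is a hypothetical helper, the 1000 KB/s default threshold is an arbitrary illustration rather than a recommended value, and the column positions (si = 7, so = 8, wa = 16) assume the classic vmstat layout shown above.

```bash
# flag_thrash [THRESHOLD_KB]   (hypothetical helper)
# Reads vmstat output on stdin and prints one line for every sample
# whose si + so exceeds the threshold.
flag_thrash() {
    awk -v limit="${1:-1000}" '
        $1 ~ /^[0-9]+$/ {              # data rows only; header rows are skipped
            if (++n == 1) next         # first data row is the since-boot average
            si = $7; so = $8; wa = $16
            if (si + so > limit)
                printf "THRASHING si=%d so=%d wa=%d%%\n", si, so, wa
        }'
}

# Example use: sample for 10 seconds and report only the bad rows.
#   vmstat 1 10 | flag_thrash 1000
```

If most of the samples are flagged, the swap traffic is sustained rather than a one-off burst.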
Which Process Is Eating Swap?
Conclusion: smem -s swap -r is the most readable way to see per-process swap usage. If smem is unavailable, read VmSwap from /proc/<pid>/status for each process. The biggest swap consumer is usually the main cause of thrashing.
smem can show swap usage per process directly (sudo apt install smem).
$ sudo smem -s swap -r | head
  PID User     Command                        Swap      USS      PSS      RSS
 2314 www-data java -Xmx3g -jar app.jar    1245184   210432   215300   240128
 1190 mysql    /usr/sbin/mysqld             412300    98200   101400   130560
 1532 root     /usr/bin/dockerd             120400    40100    42300    61440
The process with the outsized Swap column is the culprit. If you cannot install smem (for example, because adding a new package is undesirable), read /proc directly.
# List VmSwap of all processes, largest first (KB)
$ for f in /proc/[0-9]*/status; do
  awk '/^Name:/{n=$2} /^VmSwap:/{print $2, n, FILENAME}' "$f"
done 2>/dev/null | sort -rn | head
1245184 java /proc/2314/status
412300 mysqld /proc/1190/status
120400 dockerd /proc/1532/status
A misconfigured memory reservation (a JVM -Xmx larger than physical RAM, an oversized DB buffer pool, etc.) is the typical cause. Lowering the limit to a value that fits physical memory is often the real fix.
High swap usage does not automatically mean "the culprit." To find "the process moving pages in and out right now," take smem several times during thrashing and watch which processes have a changing Swap value. A static one is just sleeping pages.
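The "take it several times and compare" step can be sketched with /proc alone. Both function names below are hypothetical, and the snapshot format (PID, VmSwap in KB, name) is simply a convention chosen for this sketch.

```bash
# swap_snapshot: print "PID VmSwap_kB NAME" for every process that reports VmSwap.
swap_snapshot() {
    for f in /proc/[0-9]*/status; do
        awk '/^Name:/{n=$2} /^Pid:/{p=$2} /^VmSwap:/{print p, $2, n}' "$f"
    done 2>/dev/null
}

# swap_delta BEFORE AFTER: list processes whose VmSwap changed between two
# snapshots as "PID delta_kB NAME", largest absolute change first.
swap_delta() {
    awk 'FNR == NR { before[$1] = $2; next }      # load the older snapshot
         ($1 in before) && $2 != before[$1] {
             d = $2 - before[$1]
             printf "%d %d %d %s\n", (d < 0 ? -d : d), $1, d, $3
         }' "$1" "$2" | sort -rn | cut -d' ' -f2-
}

# Example use (file names are arbitrary):
#   swap_snapshot > /tmp/swap.1; sleep 10; swap_snapshot > /tmp/swap.2
#   swap_delta /tmp/swap.1 /tmp/swap.2
```

Processes that never appear in the delta output hold static swapped pages; the ones that keep appearing are the ones actively paging.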
How Do You Stop the Slowness Right Now? (First Aid)
Conclusion: Stopping a runaway or oversized process, or lowering its memory limit and restarting, is the fastest fix. To reset accumulated swap, use swapoff -a && swapon -a, but running it with no free memory invites the OOM killer, so it is dangerous. First aid should only move in the direction of "freeing memory."
The safest and most effective step is to stop or reconfigure the oversized process you found with smem.
# Restart the culprit through the proper path (after reviewing its config)
$ sudo systemctl restart myapp.service
To "reset" by pulling swapped-out pages back into RAM, use swapoff → swapon. But this reloads all swap contents into memory at once, so if free physical memory is below the swap usage, the OOM killer fires.
# Dangerous: invites OOM if free memory is less than swap usage
$ free -h    # Confirm available > Swap used first
$ sudo swapoff -a && sudo swapon -a
Do not casually run swapoff -a in the middle of thrashing. It tries to bring every swapped page back into memory, and if there is not enough, the kernel kills processes. Always stop the culprit and create headroom in free before running it.
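The "available > Swap used" precondition can be checked mechanically instead of by eye. This is a sketch only: safe_to_swapoff is a hypothetical helper that reads the MemAvailable, SwapTotal and SwapFree fields of /proc/meminfo, and passing the check lowers the risk but does not guarantee that swapoff will succeed.

```bash
# safe_to_swapoff [MEMINFO_FILE]   (hypothetical helper)
# Prints the two numbers and returns success only when MemAvailable
# is larger than the amount of swap currently in use.
safe_to_swapoff() {
    awk '
        /^MemAvailable:/ { avail  = $2 }
        /^SwapTotal:/    { stotal = $2 }
        /^SwapFree:/     { sfree  = $2 }
        END {
            used = stotal - sfree
            printf "available=%d kB, swap used=%d kB\n", avail, used
            exit !(avail > used)        # exit status 0 means "enough headroom"
        }' "${1:-/proc/meminfo}"
}

# Example use: only cycle swap when the check passes.
#   safe_to_swapoff && sudo swapoff -a && sudo swapon -a
```

With the free -h figures shown earlier (about 90 MiB available against about 1.9 GiB of swap in use), the check fails and the swapoff is skipped.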
"Just reboot" makes the symptom disappear, but it will recur unless you fix the memory configuration or the leak. After first aid, always move on to identifying the root cause (oversized config / leak / plain insufficient RAM).
Should You Tune swappiness?
Conclusion: vm.swappiness is a tendency value (0–100 on older kernels, 0–200 since Linux 5.8; default 60) for "how aggressively to use swap." Lowering it makes the kernel less eager to evict application pages, which can improve the feel of an interactive server. But it does not solve physical memory shortage itself. The real fix is more RAM or less usage.
Check the current value.
$ cat /proc/sys/vm/swappiness
60
Lower it temporarily and observe (reverts on reboot).
# Start around 10 rather than 0
$ sudo sysctl -w vm.swappiness=10
Once you confirm the effect, make it permanent.
$ echo 'vm.swappiness = 10' | sudo tee /etc/sysctl.d/99-swappiness.conf
$ sudo sysctl --system
| swappiness | Tendency | Suits |
|---|---|---|
| 0–10 | Avoid swap as much as possible | Interactive servers, low latency |
| 60 (default) | Balanced | General purpose |
| 100 | Swap aggressively | Batch, throughput-oriented |
Setting swappiness to 0 does not fully disable swap (it still swaps under memory pressure). Lowering it too far also upsets the balance with the file cache and can create a different slowness. Avoid extreme values and tune while watching vmstat for si/so to settle.
How Do You Cap a Service's Swap With cgroups?
Conclusion: For a systemd-managed service, MemoryMax / MemoryHigh impose a memory limit and contain thrashing so one service cannot drag down the whole host. When the limit is exceeded, only that service is subject to reclaim / OOM, and the rest are protected.
Set a memory limit on a specific service (e.g. an oversized Java app).
$ sudo systemctl edit myapp.service
Write the limits in the [Service] section.
[Service]
MemoryHigh=2G
MemoryMax=2.5G
MemoryHigh is a "soft limit" that triggers aggressive reclaim when exceeded; MemoryMax is a "hard limit" that triggers OOM kill when exceeded. Apply after saving.
$ sudo systemctl daemon-reload
$ sudo systemctl restart myapp.service
$ systemctl show -p MemoryMax myapp.service
Now the service's memory usage is capped by the cgroup and no longer spreads thrashing across the whole system.
Setting MemoryHigh slightly below MemoryMax creates a buffer zone where the service "is throttled and slows down" before being suddenly killed at the hard limit. In production, specifying both is safer than MemoryMax alone.
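Whether the limits are actually being hit can be read from the service's cgroup event counters. The sketch below assumes cgroup v2 and that the unit lives under system.slice, so the default path is an assumption to adjust for your unit; cgroup_pressure is a hypothetical helper name.

```bash
# cgroup_pressure [EVENTS_FILE]   (hypothetical helper)
# Summarises the cgroup v2 memory.events counters for one service:
#   high     = times the cgroup was throttled for exceeding MemoryHigh
#   max      = times usage was about to exceed MemoryMax
#   oom_kill = processes in the cgroup killed by the OOM killer
cgroup_pressure() {
    awk '$1 == "high"     { high  = $2 }
         $1 == "max"      { max   = $2 }
         $1 == "oom_kill" { kills = $2 }
         END { printf "high=%d max=%d oom_kill=%d\n", high, max, kills }' \
        "${1:-/sys/fs/cgroup/system.slice/myapp.service/memory.events}"
}

# Example use:
#   cgroup_pressure
```

A steadily rising high count with oom_kill at 0 suggests the buffer zone is absorbing the pressure; a rising oom_kill count suggests the limits are too tight for the workload or the service needs less memory.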
What If the Real Cause Is Insufficient RAM? (Adding Swap / Capacity)
Conclusion: If there is no misconfiguration and no leak, and memory is simply always short, adding hardware (more RAM) is the proper path. Adding swap is only a "better than dying to OOM" buffer; even with more swap, thrashing stays slow. Adding swap buys time; adding RAM solves it.
To avoid an imminent OOM when you cannot add RAM immediately (for example, on a cloud instance), add a swap file:
# Create a 2GB swap file
$ sudo fallocate -l 2G /swapfile
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
Make it permanent by adding to /etc/fstab.
/swapfile none swap sw 0 0
Adding swap does not stop the paging while memory is short, so it stays "slow." Adding swap is a stopgap to avoid OOM kills; it does not cure the slowness of thrashing itself. If you thrash chronically, more RAM or reducing the workload (lowering per-process memory limits, reducing concurrency) is the correct approach.
Checklist When It Still Won't Improve
Conclusion: Thrashing means physical memory is insufficient. Confirm si/so with vmstat → identify the culprit with smem → fix the oversized config / leak → add RAM if still short. The cause almost always surfaces somewhere along this sequence. swappiness and cgroups treat symptoms; the root issue is how demand compares with the amount of memory.
- [ ] Confirmed si/so in vmstat 1 are continuously large (rate, not usage)?
- [ ] Confirmed wa (I/O wait) is high while us/sy are low in top?
- [ ] Checked available and Swap headroom with free -h?
- [ ] Identified the culprit with smem -s swap -r or VmSwap in /proc/*/status?
- [ ] Does that process's memory config (JVM -Xmx, DB buffers) fit physical RAM?
- [ ] Ruled out a memory leak (not monotonically increasing over time)?
- [ ] Confirmed available > Swap used before first aid (swapoff)?
- [ ] Considered more RAM / less concurrency if chronic?