vmstat, iostat, and sar - Understanding Linux Performance Analysis Tools
What You Will Learn
- What each tool does and how to read its output
- How to identify CPU, memory, I/O, and network bottlenecks
- When to reach for vmstat vs. iostat vs. sar
Quick Summary: Role of Each Tool
| Tool | Best For |
|---|---|
| vmstat | System-wide overview: CPU, memory, swap, I/O at a glance |
| iostat | Per-device I/O details: await, IOPS, utilization |
| sar | Historical data: trend analysis and post-incident review |
Prerequisites
- OS: Ubuntu or RHEL-based Linux
- iostat and sar require the sysstat package
- Install with: sudo apt install sysstat
What Is vmstat?
vmstat (Virtual Memory Statistics) displays CPU, memory, swap, I/O, and process counts in a single output. It is the first tool to reach for when you need a quick system-wide picture to narrow down which subsystem is under pressure.
Basic Usage
vmstat [interval [count]]
$ vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 1543200  52480 921600    0    0    14    38  312  580 12  3 84  1  0
 0  0      0 1540800  52480 922100    0    0     0    12  280  510  5  1 93  1  0
 0  0      0 1539900  52480 922500    0    0     0    16  295  530  4  1 94  1  0
Reading the Fields
procs
- r: Runnable processes (running or waiting for CPU time). If this consistently exceeds your CPU count, you have CPU saturation.
- b: Processes blocked in uninterruptible sleep (typically waiting for I/O).
memory (in KB)
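- swpd: Swap in use. A non-zero value warrants attention, but it only signals an active problem when si/so are also non-zero.
- free: Unused memory.
- buff / cache: Buffer and page cache. The kernel holds this memory for reuse and can reclaim it, so it is not wasted.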
swap
- si: Swap-in rate (disk → memory). Consistently non-zero means memory pressure.
- so: Swap-out rate (memory → disk). Consistently non-zero means memory pressure.
io (blocks/sec)
- bi: Blocks read from block devices.
- bo: Blocks written to block devices.
cpu (%)
- us: User-space CPU usage.
- sy: Kernel-space CPU usage.
- id: Idle time.
- wa: Time waiting for I/O. Sustained values above 10% suggest an I/O bottleneck.
- st: Time stolen by the hypervisor from this VM (virtualized environments only).
The first row is cumulative
The first line of vmstat output is the average since boot and is useful only as a baseline. Diagnose from the second row onward. Common patterns: vmstat 1 for continuous monitoring, vmstat 5 12 for a one-minute snapshot at five-second intervals.
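The field rules above can be applied mechanically. The following is a minimal sketch, not a definitive monitor: `vmstat_flags` is a hypothetical helper name, the column positions assume the default vmstat layout shown earlier (r in column 1, si/so in columns 7 and 8, wa in column 16), and the thresholds are the rules of thumb from this section.

```shell
# vmstat_flags NCPU: read vmstat output on stdin and print one line for
# each sample that crosses a rule-of-thumb threshold.
# Assumes the default column order: r=1, si=7, so=8, wa=16.
vmstat_flags() {
    awk -v ncpu="$1" '
        # Data rows start with a number, which skips both header lines.
        $1 ~ /^[0-9]+$/ {
            n++
            if (n == 1) next              # first sample is the average since boot
            msg = ""
            if ($1 > ncpu)        msg = msg " run-queue>" ncpu
            if ($7 > 0 || $8 > 0) msg = msg " swapping"
            if ($16 > 10)         msg = msg " iowait>10%"
            if (msg != "") print "sample " n ":" msg
        }'
}
```

A typical invocation would be `vmstat 1 10 | vmstat_flags "$(nproc)"`. Samples that stay within all thresholds produce no output.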
What Is iostat?
iostat shows CPU summary statistics alongside per-device I/O metrics. After vmstat raises suspicion of an I/O bottleneck, iostat -x pinpoints which device is the culprit.
Installation (first time only)
$ sudo apt install sysstat   # Ubuntu/Debian
$ sudo dnf install sysstat   # RHEL/Fedora
Basic Usage
$ iostat -x 1 5
Device     r/s    w/s   rkB/s    wkB/s  await  r_await  w_await  %util
sda       1.20   5.30   48.00   212.00   2.50     1.80     2.70   3.20
nvme0n1  25.00  80.00  800.00  3200.00   0.48     0.40     0.52   8.50
Key Fields
| Field | Description | Warning Threshold |
|---|---|---|
| await | Average response time per request (ms) | HDD: >20 ms / SSD: >1 ms |
| r_await | Read response time (ms) | n/a |
| w_await | Write response time (ms) | n/a |
| %util | Device busy percentage | Concern: >80%, Saturation: 100% |
| r/s, w/s | Read/write operations per second (IOPS) | Compare against device spec |
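When %util approaches 100%, the device is busy nearly all the time: the I/O queue builds up and response times inflate. A spike in await alongside high %util confirms the device is the bottleneck. Treat %util with care on devices that serve requests in parallel (RAID arrays, modern SSDs and NVMe drives), where a high value does not necessarily mean the device has reached its limit. Note that %util is a device-level metric, not partition-level.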
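As a rough illustration of the utilization check, the sketch below filters iostat -x output for busy devices. `busy_devices` is a hypothetical helper; it looks the utilization column up by the header name `%util` (the name real iostat prints) rather than by position, on the assumption that the set and order of columns differs between sysstat versions.

```shell
# busy_devices LIMIT: read `iostat -x` output on stdin and print the name
# and utilization of every device whose %util exceeds LIMIT.
busy_devices() {
    awk -v limit="$1" '
        # A device table starts with a header line; find the %util column.
        $1 ~ /^Device/ {
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        NF == 0 { col = 0 }               # a blank line ends the device table
        col && $col + 0 > limit + 0 { print $1, $col }
    '
}
```

For example, `iostat -x 1 5 | busy_devices 80` would list devices above the 80% concern level in each report. Keep in mind that the first report covers the time since boot.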
Filter to Specific Devices
$ iostat -x -d sda nvme0n1 1
What Is sar?
sar (System Activity Reporter) continuously collects CPU, memory, I/O, and network metrics and stores them as historical data. Its main value is answering questions like "what happened at 3 AM last night?" that real-time tools cannot.
Enabling sar
$ sudo apt install sysstat
$ sudo systemctl enable sysstat --now
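Once enabled, the sadc collector (run periodically rather than as a long-lived daemon) writes records to /var/log/sa/saDD on RHEL-based systems, or /var/log/sysstat/saDD on Debian and Ubuntu, where DD is the two-digit day of the month. Data accumulates daily.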
Common Options
$ sar -u 1 5        # CPU utilization
$ sar -r 1 5        # Memory usage
$ sar -b 1 5        # I/O statistics
$ sar -n DEV 1 5    # Network stats per interface
$ sar -n EDEV 1 5   # Network error statistics
$ sar -q 1 5        # Load average and process counts
Reviewing Historical Data
# Today's CPU stats (all recorded intervals)
$ sar -u
# A full day of all metrics (-A flag)
$ sar -A -f /var/log/sa/sa01
00:00:01  all  2.34  0.00  5.67  0.12  0.00  91.87
01:00:01  all  1.23  0.00  3.45  0.08  0.00  95.24
02:00:01  all  0.98  0.00  2.11  0.05  0.00  96.86
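To answer a question such as "what happened at 3 AM last night?", the daily file can be combined with sar's -s and -e options, which limit the report to a start and end time. The helpers below are a sketch under stated assumptions: `sa_file` and `cpu_window` are hypothetical names, and the data directory is passed in because it differs by distribution (/var/log/sa on RHEL-based systems, /var/log/sysstat on Debian and Ubuntu).

```shell
# sa_file DIR DAY: path of the daily data file for a day of the month.
# A leading zero is stripped so that "08" is not parsed as octal by printf.
sa_file() {
    printf '%s/sa%02d\n' "$1" "${2#0}"
}

# cpu_window DIR DAY START END: CPU history limited to a time window.
# START and END use the HH:MM:SS form accepted by sar -s and -e.
cpu_window() {
    sar -u -f "$(sa_file "$1" "$2")" -s "$3" -e "$4"
}
```

With GNU date, yesterday's 03:00 to 04:00 window could be requested as `cpu_window /var/log/sa "$(date -d yesterday +%d)" 03:00:00 04:00:00`.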
Changing the collection interval
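The default collection interval is 10 minutes. Depending on the distribution and sysstat version, it is set in /etc/cron.d/sysstat or in the sysstat-collect.timer systemd unit; /etc/sysstat/sysstat holds related settings such as history retention. For production environments where you need post-incident precision, change the interval to 1–2 minutes. Keep in mind that shorter intervals increase disk usage proportionally.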
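The disk-usage trade-off follows from simple arithmetic on the number of records written per day. The sketch below uses a hypothetical helper, `records_per_day`, and assumes each record is roughly the same size, so file size scales with the record count.

```shell
# records_per_day INTERVAL_MINUTES: how many samples are written per day
# at a given collection interval.
records_per_day() {
    echo $(( 24 * 60 / $1 ))
}
```

`records_per_day 10` gives 144 and `records_per_day 1` gives 1440, so moving from the default to a 1-minute interval means about ten times as many records in each daily file.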
Which Tool Should You Use?
Matching the right tool to your investigation question cuts diagnosis time significantly.
| Question | Tool | Key Fields |
|---|---|---|
| What is the overall system state? | vmstat 1 | r, wa, si/so |
| Is CPU the bottleneck? | vmstat / sar -u | r, us+sy, id |
| Which disk is slow? | iostat -x 1 | await, %util |
| Is the system swapping? | vmstat | si, so, swpd |
| What happened last night? | sar -u / -r / -b | Time-series view |
| Is the network saturated? | sar -n DEV | rxkB/s, txkB/s |
Practical Bottleneck Detection Patterns
Four common diagnostic sequences used in production environments.
1. Suspected CPU Bottleneck
# Step 1: System overview
$ vmstat 1 10
# Watch if r (run queue) consistently exceeds CPU count
# Step 2: Confirm with sar
$ sar -u 1 10
# us + sy sustained above 90% → CPU is the bottleneck
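"Consistently exceeds" can be made concrete by counting how many samples are over the CPU count. This is a sketch only: `runq_saturated` is a hypothetical helper, and the default cutoff of 80% of samples is an assumption chosen for illustration, not a value from vmstat or this guide.

```shell
# runq_saturated NCPU [PCT]: exit 0 when more than PCT percent (default 80)
# of the vmstat samples on stdin have r > NCPU, otherwise exit 1.
# The first sample (average since boot) is ignored.
runq_saturated() {
    awk -v ncpu="$1" -v pct="${2:-80}" '
        $1 ~ /^[0-9]+$/ {
            if (++n == 1) next
            total++
            if ($1 > ncpu) over++
        }
        END { exit !(total > 0 && over * 100 > total * pct) }
    '
}
```

It could be used as `vmstat 1 10 | runq_saturated "$(nproc)" && echo "CPU saturated"`.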
2. Suspected Memory or Swap Pressure
# Step 1: Check swap activity
$ vmstat 1 10
# si/so consistently non-zero → active swapping (memory pressure)
# Step 2: Memory and swap detail
$ sar -r 1 5
$ sar -S 1 5
# High %memused (sar -r) + increasing kbswpused (sar -S) → needs action
3. Suspected I/O Bottleneck
# Step 1: vmstat to confirm wa
$ vmstat 1 5
# wa above 10% → I/O bottleneck suspected
# Step 2: iostat to identify the device
$ iostat -x 1 10
# Focus on devices with high await or %util above 80%
# HDD warning: await >20ms / SSD warning: await >1ms
4. Suspected Network Issue
$ sar -n DEV 1 5
# rxkB/s / txkB/s: check bandwidth utilization
$ sar -n EDEV 1 5
# rxerr/s / txerr/s non-zero: possible hardware or driver issue
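Checking bandwidth utilization means comparing the observed throughput with the link speed. The conversion can be sketched as below; `link_util_pct` is a hypothetical helper, and it assumes sar's kB is 1024 bytes while link speed is given in decimal megabits per second.

```shell
# link_util_pct KB_PER_SEC LINK_MBIT: percent of link capacity in use.
# Converts kB/s (1024-byte units, an assumption about sar's output) to
# bits per second and divides by the link speed in bits per second.
link_util_pct() {
    awk -v kb="$1" -v mbit="$2" 'BEGIN {
        printf "%.1f\n", kb * 1024 * 8 / (mbit * 1000000) * 100
    }'
}
```

For instance, `link_util_pct 6103.515625 1000` prints 5.0, meaning about 6.1 MB/s on a 1 Gbit/s link is 5% of capacity. On Linux the link speed is usually readable from /sys/class/net/<iface>/speed, and recent sysstat versions also print a %ifutil column in sar -n DEV.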