Ubuntu CPU 100% Troubleshooting: top/ps/load average Guide
What You'll Learn
- How to quickly identify the process causing CPU 100%
- How to distinguish between "true CPU issue" vs "I/O wait" or "swap thrashing"
- How to save evidence (logs/values) before restarting, so you can work on prevention afterwards
Quick Summary
When CPU is high, follow this order:
- Overall status: uptime (load average)
- Find the culprit with top: top (sort by CPU, also check %us/%sy/%wa)
- CPU top list with ps: ps aux --sort=-%cpu | head
- If it's a service, check logs: systemctl status / journalctl -u
- Rule out "looks like CPU but isn't": I/O wait (wa) / memory shortage (swap)
Prerequisites
- OS: Ubuntu
- Target: Server beginners
- sudo access
- Goal: Isolation → Root cause → First steps toward prevention
1. Two Types of "CPU 100%" (Blind Spot)
When CPU is high, the cause is usually one of these:
Type A: Actually CPU Computation Is Heavy
- Calculation/loops/encryption/conversion/aggregation
- top shows a specific process eating CPU
- %us (user CPU) is usually high
Type B: Looks Like CPU But Actually I/O Wait / Memory Shortage
- Slow disk (log bloat, DB, EBS degradation, I/O limit)
- Swap growing = extremely slow (memory shortage)
- %wa (I/O wait) is usually high
"CPU 100% so add more CPU" is premature. First check %wa and swap to confirm "Is it really CPU?"
2. uptime for load average (Overall Pressure)
$ uptime
Example output:
14:05:12 up 10 days, 3:20, 2 users, load average: 3.52, 3.10, 2.90
What load average means (simplified for beginners):
- If CPU cores = 4, load around 4 means "busy"
- On that same 4-core machine, a load of 10 means things are seriously congested
Note: on Linux the load average includes processes waiting on I/O, not just CPU, so look at top next for the breakdown.
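To know which number the load should be compared against, check your core count; /proc/loadavg holds the same figures uptime reports.
# Number of CPU cores to compare the load average against
$ nproc
# The raw load-average figures (first three numbers)
$ cat /proc/loadavg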
3. top for "Culprit" and "CPU Breakdown"
$ top
3-1. top Shortcuts (Must Know)
- P: Sort by CPU
- M: Sort by memory (when memory is suspect)
- 1: Show per-core CPU (check for imbalance)
- q: Quit
3-2. Reading CPU Breakdown (%us / %sy / %wa)
Shown at top of the display:
- %us: App computation using CPU
- %sy: Kernel/system processing (network/disk/interrupts)
- %wa: I/O wait (disk slow/congested)
Quick interpretation:
- %us high → App processing is heavy (Type A)
- %wa high → I/O is congested (Type B)
- %sy high → System-side processing is heavy
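If you want to keep what top shows as evidence rather than taking a screenshot, batch mode writes a snapshot to a file. The filename here is just an example:
# -b = batch (non-interactive), -n 1 = one iteration; keep the header plus the top processes
$ top -b -n 1 | head -n 30 > top_snapshot.txt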
4. ps for CPU Top List (Save Evidence)
top is a live, constantly changing view; ps gives you a static snapshot you can save.
$ ps aux --sort=-%cpu | head -n 20
Key columns to look at:
- COMMAND: What's eating CPU
- %CPU: Which one is the outlier
If you see many processes of the same type (worker proliferation), the likely causes are missing worker limits or excessive load.
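To save that snapshot as evidence before anything gets restarted, redirect the same command into a timestamped file (the filename pattern is just an example):
# Same top-20 list, written to a file you can refer back to later
$ ps aux --sort=-%cpu | head -n 20 > "cpu_top_$(date +%Y%m%d_%H%M%S).txt"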
5. If It's a Service: systemctl / journalctl
If the process is nginx, apache2, or an app (php-fpm, node, gunicorn, etc.), logs often show the cause.
5-1. Service Status
$ sudo systemctl status nginx
$ sudo systemctl status apache2
$ sudo systemctl status php8.1-fpm
5-2. Logs (Last 200 Lines)
$ sudo journalctl -u nginx -n 200
$ sudo journalctl -u php8.1-fpm -n 200
When CPU is high, the logs often show a flood of requests, repeated errors, or a restart loop. Look for those patterns first.
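To cut a noisy journal down to the interesting lines, you can also filter by time and priority; the service name and time window below are placeholders:
# Warnings and worse from the last hour for a given service
$ sudo journalctl -u <service> --since "1 hour ago" -p warning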
6. Rule Out "Actually Not CPU" (Important)
Skip this and you'll misdiagnose.
6-1. Check for Memory Shortage (Swap)
$ free -h
If swap usage is growing or the available column is small, it may look like a CPU problem, but the root cause is memory.
6-2. If %wa Is High, Suspect Disk I/O
- Log bloat
- DB
- Docker layer explosion
- Storage performance issues
$ iostat -x 1 5
If I/O wait is the cause, adding CPU won't help (wasted opportunity).
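Note that iostat above (and pidstat below) come from the sysstat package, which is not installed by default on Ubuntu. pidstat is a handy way to see which process is generating the I/O:
# iostat and pidstat live in the sysstat package
$ sudo apt install sysstat
# Per-process disk I/O, one-second samples, five times
$ pidstat -d 1 5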
7. Common Cause Patterns (By Frequency)
Pattern 1: Infinite Loop / Bug / Runaway
- One process consuming CPU constantly (%us high)
- Fix: Check logs, recent deployments, and recent changes (a quick check is shown below)
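One quick check that helps tie a runaway process to a deployment is its elapsed runtime; <PID> below stands for the PID you found with top or ps:
# etime = time since the process started; compare it with your last deploy
$ ps -o pid,etime,cmd -p <PID>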
Pattern 2: Bot / Scan / Traffic Surge
- nginx/apache access logs exploding
- Fix: Check web logs (see the one-liner below), consider rate limiting/WAF/caching
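A quick way to see whether a handful of clients is responsible is to count requests per IP. The log path below assumes a default nginx setup; adjust it for apache2 or a custom location:
# Top 10 client IPs by request count
$ sudo awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head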
Pattern 3: Too Many Workers (php-fpm, etc.)
- Child processes keep growing, eating CPU and memory
- Fix: Set worker limits (e.g. pm.max_children for php-fpm); a quick check is shown below
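To see where the limit currently stands and how many workers are actually running, something like the following works; the config path matches the php8.1-fpm examples above, so adjust it for your PHP version:
# Current process-manager settings for the default pool
$ grep '^pm' /etc/php/8.1/fpm/pool.d/www.conf
# How many php-fpm worker processes are running right now
$ pgrep -c -f 'php-fpm: pool'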
Pattern 4: I/O Is Slow, So CPU Appears High
- %wa is high
- Fix: Investigate disk I/O
8. Things to Avoid
Don't: Restart Without Saving Evidence
At minimum, capture these before deciding (a one-command version follows the list):
- uptime
- free -h
- ps aux --sort=-%cpu | head
- top (screenshot if possible)
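If you want all of that in one file with a single command before deciding anything, a brace group works; the filename is just an example:
# uptime, memory, CPU top list, and a top snapshot, all in one timestamped file
$ { uptime; free -h; ps aux --sort=-%cpu | head -n 20; top -b -n 1 | head -n 30; } > "evidence_$(date +%Y%m%d_%H%M%S).txt"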
Don't: Scale Up CPU Without Checking %wa
If I/O wait is the cause, CPU scaling is a waste (opportunity cost).
Don't: Leave Unlimited Worker Settings Alone
System will break the moment traffic increases. Set limits first.
Copy-Paste Template
# 1) Overview
uptime
# 2) Also check memory (rule out "actually not CPU")
free -h
# 3) CPU top list (evidence)
ps aux --sort=-%cpu | head -n 20
# 4) top for breakdown (%us/%sy/%wa) and culprit
top
# 5) If it's a service, check logs
sudo systemctl status <service>
sudo journalctl -u <service> -n 200
Summary
- CPU high: first distinguish "true CPU" vs "I/O wait / memory shortage"
- uptime → top → ps → journalctl is the fastest path
- Restart is a last resort. Save evidence before deciding.
Test Environment
Commands in this article were tested on Ubuntu 24.04 LTS / bash 5.2.