🩺 Troubleshooting

A reverse-lookup reference to quickly diagnose Linux errors and incidents by symptom. Covers Permission denied, No space left on device, SSH failures, high CPU load, memory pressure, DNS resolution failures, and more.

🔒 Permission denied

Permissions
Intermediate ⏱️ ~10 min

How to fix Permission denied

chmod / chown / sudo usage and avoiding common pitfalls.

View solution →

🌐 Connectivity / Network

Network
Intermediate ⏱️ ~12 min

SSH connection troubleshooting checklist

Permission denied (publickey), Host key verification failed, Connection timeout.

View solution →
Intermediate ⏱️ ~12 min

DNS resolution troubleshooting

dig / host / nslookup usage and /etc/resolv.conf inspection.

View solution →
Intermediate ⏱️ ~12 min

Port connectivity troubleshooting

nc / telnet / curl for connection check, ss / netstat for port state.

View solution →
Intermediate ⏱️ ~12 min

ufw firewall and SSH connection

Recovering when ufw enable cuts off SSH.

View solution →
Intermediate ⏱️ ~11 min

Diagnosing "Connection refused"

The host is reachable but actively rejects you. Tell a stopped service from a blocked port, and refused from timeout, in the fewest steps.

View solution →
Intermediate ⏱️ ~12 min

Diagnosing "Connection timed out"

A silent, long wait with no reply. Triage a firewall DROP, an unopened security group, a dead service, and a path blackhole, from your host outward.

View solution →
Intermediate ⏱️ ~11 min

Diagnosing "No route to host"

There is no path to deliver packets to the host. Triage a missing route, a failed ARP lookup, an intermediate router, and a firewall REJECT, from your host outward.

View solution →
Intermediate ⏱️ ~11 min

Fixing "Host key verification failed"

When a rebuilt server or reused IP makes the known_hosts key mismatch, remove the entry with ssh-keygen -R and reconnect safely.

View solution →
Intermediate ⏱️ ~11 min

Fixing "Permission denied (publickey)"

Check whether the key is offered, authorized_keys, permissions, and sshd_config from both client and server sides.

View solution →
Intermediate ⏱️ ~12 min

Fixing SSH Disconnects: Keepalive and Timeouts

When SSH drops after a pause, triage idle timeouts vs missing keepalive and fix it for good with ServerAliveInterval / ClientAliveInterval.

View solution →
Intermediate ⏱️ ~11 min

Fixing "Too many authentication failures"

Understand how offering too many ssh-agent keys exceeds MaxAuthTries, and fix it for good with IdentitiesOnly and ~/.ssh/config.

View solution →
Intermediate ⏱️ ~11 min

Fixing "certificate verify failed"

Fix "SSL certificate verify failed" in curl and Python. Triage CA bundle, incomplete chain, and clock skew with openssl s_client, then fix each cause.

View solution →
Intermediate ⏱️ ~9 min

Fixing "sudo: unable to resolve host"

Treat the "unable to resolve host" warning as a mismatch between hostname and /etc/hosts, then fix the 127.0.1.1 line to clear both the warning and the delay.

View solution →
Intermediate ⏱️ ~10 min

Fixing "Address already in use"

Find the process holding the port with ss / lsof / fuser, release it safely, and handle TIME_WAIT and double-start cases.

View solution →

📊 Server Performance

Performance
Intermediate ⏱️ ~10 min

Diagnosing 100% CPU usage

Identifying culprit processes with top / ps / load average.

View solution →
Intermediate ⏱️ ~12 min

Investigating memory pressure

free / top / ps usage and OOM Killer mitigation.

View solution →
Intermediate ⏱️ ~10 min

Diagnosing slow disk I/O

iostat / vmstat to pinpoint disk bottlenecks.

View solution →
Intermediate ⏱️ ~12 min

Handling OOM killer events

When a process is killed without explanation, confirm the cause with dmesg/journalctl, adjust oom_score_adj, and add swap.

View solution →
Intermediate ⏱️ ~10 min

Fixing "Too many open files" - Resolving File Descriptor Exhaustion

"Too many open files" means you hit the file descriptor limit. Use ulimit, lsof, limits.conf and sysctl to diagnose and fix it permanently.

View solution →
Intermediate ⏱️ ~8 min

Understanding and Handling Zombie Processes in Linux

Zombie processes are terminated processes that linger in the process table. Learn what causes them, how to find them, and how to clean them up safely.

View solution →
Intermediate ⏱️ ~11 min

Diagnosing High Load Average

Tell CPU-bound from I/O-bound when load average is high. How to read uptime, top, vmstat, and iostat to find the cause.

View solution →
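
A common first step when reading load average is to divide it by the core count, since a load of 8 means something very different on 4 cores than on 32. The sketch below shows that arithmetic only; load_per_core is a hypothetical helper name, and it assumes awk is available.

```shell
# load_per_core LOAD CORES
# Hypothetical helper: prints the load average divided by the core count.
# A sustained value above 1.00 suggests more runnable or blocked tasks
# than the CPUs can serve.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# On Linux the inputs can come from /proc/loadavg and nproc:
#   load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

If the ratio is high, the r, b, and wa columns of vmstat help tell the two cases apart: a large r points to CPU contention, while large b and wa values point to tasks waiting on I/O.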
Intermediate ⏱️ ~12 min

Fixing "Cannot allocate memory"

Fix Cannot allocate memory (ENOMEM) even when RAM looks free by diagnosing overcommit accounting, ulimit, and max_map_count, then fixing it permanently.

View solution →
Intermediate ⏱️ ~11 min

Fixing "fork: Resource temporarily unavailable"

Diagnose why fork fails with EAGAIN by checking the per-user ulimit, system pid_max, and cgroup TasksMax in order, then fix it permanently.

View solution →
Intermediate ⏱️ ~11 min

Diagnosing Swap Thrashing

Spot swap thrashing with vmstat si/so when a server suddenly slows, find the culprit process, and fix it permanently with swappiness and cgroups.

View solution →

💾 Disk Space

Disk
Intermediate ⏱️ ~10 min

Handling "No space left on device"

df / du to find bloated files and reclaim space.

View solution →
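
As a rough sketch of the df / du workflow, the helper below lists the largest files and directories under a path. largest_entries is a hypothetical name, and the example assumes a du that accepts the POSIX -a, -k, and -x options.

```shell
# largest_entries DIR [COUNT]
# Hypothetical helper: prints the COUNT largest files and directories
# under DIR, largest first, with sizes in KiB.
largest_entries() {
    local dir=${1:-.} count=${2:-10}
    # -a: include files, -k: report KiB, -x: stay on one filesystem
    du -akx -- "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Typical use: check which filesystem is full, then drill into it.
#   df -h
#   largest_entries /var 15
```

The first line of output is always the directory itself, so the interesting entries start on the second line.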
Intermediate ⏱️ ~10 min

Identifying Docker disk usage

docker system df to break down image / container / volume usage.

View solution →
Intermediate ⏱️ ~10 min

Safely deleting files with find

find -print0 / xargs -0 for filename-safe deletion.

View solution →
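
The idea behind -print0 and xargs -0 is that file names are separated by NUL bytes, so spaces and even newlines in a name cannot split it into several arguments. A minimal sketch, where delete_matching is a hypothetical helper name:

```shell
# delete_matching DIR PATTERN
# Hypothetical helper: deletes regular files under DIR whose names match
# PATTERN, passing names NUL-separated so odd characters stay intact.
delete_matching() {
    local dir=$1 pattern=$2
    find "$dir" -type f -name "$pattern" -print0 | xargs -0 rm -f --
}

# Preview first by listing what would match:
#   find /path/to/dir -type f -name '*.tmp' -print
# Then delete:
#   delete_matching /path/to/dir '*.tmp'
```

Quoting the pattern matters: an unquoted *.tmp would be expanded by the shell before find ever sees it.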
Intermediate ⏱️ ~10 min

Handling inode exhaustion

"No space left on device" while df -h shows free space? Diagnose with df -i and recover.

View solution →
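
Once df -i shows IUse% near 100, the next question is usually which directory holds the huge number of small files. The sketch below counts entries per directory; count_entries_per_dir is a hypothetical helper name, it relies on the GNU find -printf extension, and it will miscount directories whose names contain newlines.

```shell
# count_entries_per_dir ROOT [COUNT]
# Hypothetical helper: prints the COUNT directories under ROOT that
# contain the most entries (each entry uses one inode).
count_entries_per_dir() {
    local root=${1:-.} count=${2:-10}
    # %h prints each entry's parent directory; uniq -c tallies them.
    find "$root" -mindepth 1 -xdev -printf '%h\n' \
        | sort | uniq -c | sort -rn | head -n "$count"
}

# Typical use:
#   df -i
#   count_entries_per_dir /var 10
```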
Intermediate ⏱️ ~8 min

Fixing "Argument list too long"

Why rm * fails and how to fix it with find -delete and find + xargs.

View solution →
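
rm * fails because the shell expands the glob into one enormous argument list that exceeds the kernel limit (see getconf ARG_MAX). Letting find do the matching avoids that expansion entirely. A minimal sketch, assuming a find that supports -maxdepth and -delete (GNU and BSD find do); purge_files is a hypothetical helper name:

```shell
# purge_files DIR PATTERN
# Hypothetical helper: deletes regular files directly inside DIR whose
# names match PATTERN, without descending into subdirectories.
purge_files() {
    local dir=$1 pattern=$2
    # The quoted pattern is matched by find itself, so no argument list
    # is ever built by the shell.
    find "$dir" -maxdepth 1 -type f -name "$pattern" -delete
}

# Typical use:
#   purge_files /var/spool/myapp '*.log'
```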
Intermediate ⏱️ ~9 min

Disk Full but df Shows Space

Diagnose deleted-file handles and inode exhaustion with lsof / df -i.

View solution →

📜 Web Server / Logs

Logs
Intermediate ⏱️ ~12 min

Reading Nginx/Apache logs

access.log / error.log paths and how to read incident causes.

View solution →
Intermediate ⏱️ ~12 min

Fixing Server Time Skew - Synchronizing NTP with chrony and timedatectl

How to fix server time skew on Linux. Use timedatectl status to check NTP state, chronyc tracking to diagnose drift, and chronyc makestep to force immediate sync.

View solution →

💽 Filesystem / Devices

Filesystem
Intermediate ⏱️ ~11 min

Fixing "Read-only file system" Errors

A sudden read-only mount? Decide fast whether to remount or run fsck.

View solution →
Intermediate ⏱️ ~11 min

Diagnosing "Input/output error"

A file operation returns EIO. Read the kernel log with dmesg, then tell a failing disk from filesystem corruption or a disconnected device.

View solution →
Intermediate ⏱️ ~12 min

Repairing Filesystem Corruption with fsck

Repair ext4 corruption safely. Why you must unmount first, how to check the root FS, -n/-y/-f options, and the XFS difference.

View solution →
Intermediate ⏱️ ~11 min

Fixing "device is busy" on umount

umount fails with "device is busy". Find who holds the filesystem with fuser/lsof and detach it safely.

View solution →
Intermediate ⏱️ ~10 min

"No such file or directory" When the File Exists

ls shows the file, yet you get No such file or directory. Isolate hidden chars, CRLF, broken links, and a missing dynamic linker.

View solution →
Intermediate ⏱️ ~11 min

Fixing "Stale file handle" on NFS

Recover an NFS Stale file handle with a remount, then pin fsid to stop it recurring.

View solution →
Intermediate ⏱️ ~9 min

Fixing "Text file busy"

A running binary can't be overwritten with cp. Find the process with lsof and swap it in safely with rename.

View solution →
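
The rename approach works because cp onto the existing path tries to write into the file the running process has mapped, while a rename only points the path at a new file and leaves the old one untouched until the process exits. A minimal sketch, where replace_binary is a hypothetical helper name:

```shell
# replace_binary NEW TARGET
# Hypothetical helper: installs NEW at TARGET without writing into the
# file that a running process may still be executing.
replace_binary() {
    local new=$1 target=$2
    local tmp="$target.new.$$"
    # Copy next to the target so the final mv is a rename on the same
    # filesystem rather than a cross-device copy.
    cp -p -- "$new" "$tmp" && mv -f -- "$tmp" "$target"
}

# Typical use:
#   replace_binary ./build/myapp /usr/local/bin/myapp
```

The running process keeps using the old binary until it is restarted.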

📦 Package Management

Packages
Intermediate ⏱️ ~10 min

Fixing "Could not get lock"

When apt stalls with "Could not get lock", find the process holding the lock, release it safely, and recover an interrupted dpkg - in the fewest steps.

View solution →
Intermediate ⏱️ ~10 min

Fixing Broken Dependencies and Held Packages

When apt holds updates back or stops with unmet dependencies, learn to tell held from broken and repair the dependency tree without making it worse.

View solution →
Intermediate ⏱️ ~11 min

Fixing Repository GPG Key Errors (NO_PUBKEY)

Tell a NO_PUBKEY error from an expired EXPKEYSIG one, fetch the public key into a keyring, bind it with signed-by, and restore apt signature verification safely.

View solution →

🐚 Shell / Script Execution

Shell
Intermediate ⏱️ ~10 min

"command not found" for Your Own Scripts

Fix command not found for a script you wrote. Tell it apart from Permission denied, then run with ./, chmod +x, add to PATH, and clear the hash cache.

View solution →
Intermediate ⏱️ ~9 min

Fixing "bad interpreter"

A script fails with bad interpreter. Tell CRLF line endings from a wrong shebang path using file/cat -A, then fix it with dos2unix/sed.

View solution →
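
With CRLF line endings the kernel looks for an interpreter whose name ends in a carriage return, which is why the error often shows ^M after the path. The sketch below detects and removes the stray characters. Both function names are hypothetical, and it uses tr rather than dos2unix or sed so that no extra package is needed; note that tr removes every carriage return in the file, not only those at line ends.

```shell
# has_crlf_shebang FILE
# Hypothetical helper: succeeds if the first line ends in a carriage return.
has_crlf_shebang() {
    head -n 1 -- "$1" | grep -q "$(printf '\r')\$"
}

# strip_cr FILE
# Hypothetical helper: removes all carriage returns from FILE in place,
# keeping the file's permissions.
strip_cr() {
    local file=$1 tmp status
    tmp=$(mktemp) || return 1
    tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Typical use:
#   has_crlf_shebang deploy.sh && strip_cr deploy.sh
```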
Intermediate ⏱️ ~10 min

Decoding "syntax error near unexpected token"

Read the bash syntax error near unexpected token message correctly. Understand what the token points to, then debug fi/then/done, (, newline, and end of file.

View solution →
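
One way to reproduce the message without running the script is bash -n, which parses the file and reports syntax errors only. A minimal sketch, assuming bash is installed; check_syntax is a hypothetical wrapper name:

```shell
# check_syntax FILE
# Hypothetical helper: parses FILE with bash without executing it.
# Returns 0 if the syntax is valid, non-zero otherwise.
check_syntax() {
    # -n: read commands but do not execute them
    bash -n "$1"
}

# Typical use:
#   check_syntax ./deploy.sh && echo "syntax OK"
```

The reported line number is where the parser gave up, which for a missing fi or done is often the end of the file rather than the line with the actual mistake.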

⚙️ System / Services / Boot

System
Intermediate ⏱️ ~12 min

Debugging "Failed to start" systemd Services

Quickly diagnose systemctl "Failed to start" errors. Work through status, journalctl, and exit codes to fix ExecStart, permissions, dependencies, and start-limit issues.

View solution →
Intermediate ⏱️ ~11 min

When Cron Jobs Don't Run

Quickly diagnose why a cron job won't run. A checklist that walks through the daemon, logs, PATH, environment, schedule syntax, and permissions in order.

View solution →
Intermediate ⏱️ ~12 min

When Linux Won't Boot - Kernel Panic and Emergency Mode

First steps when Linux won't boot. Triage kernel panic, the initramfs prompt, GRUB and systemd emergency mode, then recover with fstab fixes, fsck and an older kernel.

View solution →
Intermediate ⏱️ ~10 min

Fixing "cannot set LC_ALL" Locale Warnings

Diagnose "cannot set LC_ALL" and "setlocale" warnings from a missing locale, confirm with locale -a, generate it with locale-gen, and stop SSH from forwarding locale variables for good.

View solution →
Intermediate ⏱️ ~10 min

Fixing Clock Skew and Certificate Errors

Diagnose "Clock skew detected" build warnings and "certificate is not yet valid" errors from a wrong clock, verify with date/openssl, and fix it for good.

View solution →
