Fixing "fork: Resource temporarily unavailable"

What does "fork: Resource temporarily unavailable" mean?

Conclusion: A call to fork(2) (or clone(2)) was temporarily refused because a limit was hit. The kernel returns EAGAIN, and the shell or app prints "Resource temporarily unavailable". It is not always low memory; the cause is usually a process or thread count limit.

This error appears the instant a program tries to spawn a child process or thread. fork() / clone() returns EAGAIN (errno 11), which gets stringified and printed.

$ ./myapp
fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable

The shell itself can emit it, in which case you cannot even launch a new command. The key point: this is not a disk or network problem. You have exhausted the kernel's budget for "how many tasks may exist at once", so only new creation is blocked. Existing processes keep running, so the symptom is narrowly "I can't start anything new".
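If it is unclear which call is being refused, a system-call trace can confirm it while the shell itself can still fork. The following is a minimal sketch, assuming strace is installed and accepts the trace=process class; the helper name failed_spawns is hypothetical. It filters trace output down to process-creation calls that returned EAGAIN.

```shell
# Hypothetical helper: keep only fork/vfork/clone/clone3 calls that failed with EAGAIN.
# Reads strace output on stdin and prints the matching lines.
failed_spawns() {
    grep -E '(^|[^a-z_])(fork|vfork|clone3?)\(.*= -1 EAGAIN'
}

# Example (assumption: strace is available; ./myapp is the program from the text):
#   strace -f -e trace=process ./myapp 2>&1 | failed_spawns
```

A matching line confirms that process or thread creation itself is what the kernel refused, rather than some other call that happens to return the same errno.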

Assumptions (target environment)

  • OS: Ubuntu / a typical Linux (systemd-based)
  • Symptom: fork / clone fails with Resource temporarily unavailable
  • You can read ulimit / /proc / systemctl (some permanent changes need sudo)

Why does fork fail with EAGAIN?

Conclusion: The cause is one of four: (1) the per-user process limit (RLIMIT_NPROC = ulimit -u), (2) the system-wide PID / thread ceiling (pid_max / threads-max), (3) the cgroup task limit (systemd TasksMax), or (4) memory exhaustion, which more often surfaces as ENOMEM than as EAGAIN. Most often it is (1) or the cgroup-based (3).

Here are the paths that make fork return EAGAIN, grouped by which limit is hit.

Cause                      | The actual limit                | Where to start checking
Per-user process limit     | RLIMIT_NPROC (ulimit -u)        | ulimit -u / ps -u <user>
System-wide PID ceiling    | kernel.pid_max                  | /proc/sys/kernel/pid_max
System-wide thread ceiling | kernel.threads-max              | /proc/sys/kernel/threads-max
cgroup task limit          | systemd TasksMax (pids cgroup)  | systemctl show -p TasksMax
Memory exhaustion          | Cannot allocate task structures | free -h / dmesg

Work from the nearest budget outward. Start with "the limit on this user itself (ulimit -u)", then "the TasksMax of the cgroup that bundles the service", and finally "the system-wide pid_max / threads-max". Web apps and containers that spawn many threads usually hit the cgroup limit (3) first.

ulimit -u is described as a "process count", but internally Linux counts every thread as one task. A multithreaded app can hit the limit "with only a few processes", so always count current usage in thread units.
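The nearest-budget-outward order described above can be sketched as a single function that prints each layer in turn. This is an illustration, not a complete diagnostic: the function name show_task_budgets is hypothetical, it assumes bash on Linux with /proc mounted, and the cgroup lookup assumes cgroup v2 mounted at /sys/fs/cgroup. It reads only the innermost cgroup, so a lower limit on a parent cgroup would not show up here.

```shell
# Print each task budget from nearest to farthest (hypothetical helper).
show_task_budgets() {
    echo "user (ulimit -u):   $(ulimit -u)"

    # On cgroup v2, /proc/self/cgroup holds one line of the form "0::/path".
    cg_path=""
    if [ -r /proc/self/cgroup ]; then
        while IFS=: read -r id controllers path; do
            [ "$id" = "0" ] && cg_path=$path
        done < /proc/self/cgroup
    fi
    pids_file="/sys/fs/cgroup${cg_path}/pids.max"
    if [ -n "$cg_path" ] && [ -r "$pids_file" ]; then
        echo "cgroup pids.max:    $(cat "$pids_file")"
    else
        echo "cgroup pids.max:    (not readable here)"
    fi

    # System-wide ceilings.
    for f in pid_max threads-max; do
        if [ -r "/proc/sys/kernel/$f" ]; then
            echo "kernel.$f: $(cat "/proc/sys/kernel/$f")"
        else
            echo "kernel.$f: (not readable here)"
        fi
    done
}

show_task_budgets
```

Whichever printed value is closest to current usage is the layer to investigate first.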

How do I check the current process / thread count and limits?

Conclusion: First read the limit with ulimit -u, count current usage with ps -eLf | wc -l (or per user), then compare. If usage is pinned at the limit, that budget is the culprit.

Start with the limit in effect for your current shell.

$ ulimit -u
4096

Next, count the current tasks (threads included). The -L in ps -eLf expands each thread to its own line, matching the unit fork/clone counts.

# Total threads system-wide (approx., includes 1 header line)
$ ps -eLf | wc -l

# Threads owned by a specific user only (this count also includes 1 header line)
$ ps -L -u www-data | wc -l
3987

If current usage (3987) is pinned just under the limit (4096), the budget is spent. To see which process is mass-producing tasks, sort by thread count (nlwp) descending.

$ ps -eo pid,nlwp,user,comm --sort=-nlwp | head
  PID NLWP USER     COMMAND
 2314 1820 www-data java
 1190  214 mysql    mysqld

The process with a runaway NLWP (thread count) is the real source. If it is an unintended thread leak (a misconfigured connection pool or worker count), suspect the application before raising any limit.
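Because the per-user limit is charged against everything a user owns, the sum across that user's processes matters as much as the single largest process. The sketch below totals thread counts per user; the helper name threads_by_user is hypothetical, and it assumes a ps that accepts the user= and nlwp= output columns used elsewhere in this article.

```shell
# Sum thread counts per user (hypothetical helper).
# Reads "USER NLWP" pairs on stdin and prints "TOTAL USER", largest total first.
threads_by_user() {
    awk '{ total[$1] += $2 } END { for (u in total) print total[u], u }' | sort -rn
}

# Live usage (the trailing "=" after each column name suppresses the header line):
#   ps -eo user=,nlwp= | threads_by_user | head
```

The top line is the user whose total should be compared against that user's ulimit -u.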

Before raising a limit, decide whether usage is "a healthy number that is simply too small" or "abnormally high due to a leak". Papering over a leak by raising the limit only makes it recur later at a higher ceiling.
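One rough way to separate the two cases is to sample the thread count several times and see whether it only ever climbs. The helper below is a sketch of that idea: thread_trend is a hypothetical name, and "every sample larger than the last" is a crude heuristic, since a healthy pool that is still warming up also grows for a while.

```shell
# Hypothetical helper: classify a series of thread-count samples (oldest first).
# Prints "growing" if every sample is larger than the one before it, else "not growing".
thread_trend() {
    if [ "$#" -lt 2 ]; then
        echo "need at least two samples" >&2
        return 1
    fi
    prev=""
    for n in "$@"; do
        if [ -n "$prev" ] && [ "$n" -le "$prev" ]; then
            echo "not growing"
            return 0
        fi
        prev=$n
    done
    echo "growing"
}

# Example: sample PID 2314 (the java process in the listing above) once a minute.
#   thread_trend $(for i in 1 2 3 4 5; do ps -o nlwp= -p 2314; sleep 60; done)
```

A count that plateaus under steady load suggests a limit that is simply too small; one that keeps growing without a matching rise in load points toward a leak.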

How do I fix the per-user limit (ulimit -u / nproc)?

Conclusion: Temporarily you can widen it with ulimit -u <n>, but to make it apply across logins, set nproc in /etc/security/limits.conf (or limits.d/). limits.conf does not reach daemons started by systemd, so they need the systemd setting instead.

First widen only the current shell to confirm the symptom disappears.

# Raise the soft limit, up to the hard limit
$ ulimit -u 8192

The permanent setting differs by login path. Interactive logins (SSH / console) honor limits.conf.

$ sudo nano /etc/security/limits.conf
# <domain> <type> <item> <value>
www-data  soft  nproc  8192
www-data  hard  nproc  16384
*         soft  nproc  4096

soft is the default value, hard is the ceiling you may raise to. These are applied via PAM's pam_limits.so, so you must log in again after editing. On Ubuntu, the recommended layout is to drop files into /etc/security/limits.d/*.conf, which are read after the main file.

limits.conf applies only to login sessions that go through PAM. It is not applied to daemons (Nginx, MySQL, etc.) started by systemd. Set process limits for services with the cgroup / systemd settings in the next section. Mixing these up leads to "I changed it but nothing happened".
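To see which limit a running process actually received, rather than what a config file says, the kernel exposes the effective values in /proc/<pid>/limits. The sketch below extracts the "Max processes" row; the helper name nproc_limit_from is hypothetical, and it assumes the usual Linux layout of that file (soft limit in the third column, hard limit in the fourth).

```shell
# Hypothetical helper: print the soft and hard "Max processes" limit from a limits file.
# $1 = path to a /proc/<pid>/limits file.
nproc_limit_from() {
    awk '/^Max processes/ { print "soft=" $3, "hard=" $4 }' "$1"
}

# Examples:
#   nproc_limit_from /proc/$$/limits      # the current shell
#   nproc_limit_from /proc/2314/limits    # a running daemon, by PID
```

If a daemon's value did not change after editing limits.conf, that is the expected result of the split described above, not a typo in the file.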

What if the cgroup / systemd TasksMax is the cause?

Conclusion: Services and login sessions managed by systemd are governed by the cgroup pids.max (= TasksMax). If editing limits.conf changes nothing, this is almost always the culprit. Check the current value with systemctl show -p TasksMax, then raise it with a drop-in.

systemd bundles each service and each user session into a cgroup and caps the total task count with TasksMax. First read the value in effect.

# Limit for a specific service
$ systemctl show -p TasksMax nginx.service

# System-wide default
$ systemctl show -p DefaultTasksMax
DefaultTasksMax=4915

DefaultTasksMax is normally a percentage of the kernel's task ceiling (15% by default; 4915 is what 15% of a pid_max of 32768 yields). Without a per-service override, this default applies. To raise a service limit, create a drop-in.

$ sudo systemctl edit nginx.service

Write the following in the editor (TasksMax under [Service]).

[Service]
TasksMax=infinity

Apply it after saving.

$ sudo systemctl daemon-reload
$ sudo systemctl restart nginx.service
$ systemctl show -p TasksMax nginx.service

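The budget for login users is the TasksMax of the user-<UID>.slice. Older systemd releases set it through UserTasksMax in /etc/systemd/logind.conf; newer releases replaced that option with a TasksMax= drop-in for user-.slice, so check which mechanism your version uses. Look here too if an interactive user running many processes gets stuck.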

TasksMax=infinity means "no limit", but with the cap removed, a leak in this one service can consume the system-wide budget (pid_max / threads-max) and take every other service down with it. Use it only after confirming the cause is not a leak; normally set a concrete value of "needed + headroom" instead.
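To choose a concrete value, it helps to see how much of the current cap a unit is using. systemctl show can report TasksCurrent alongside TasksMax; the sketch below turns those two lines into a usage figure. The helper name tasks_usage is hypothetical, and it assumes the KEY=VALUE output format shown earlier in this section.

```shell
# Hypothetical helper: read "TasksCurrent=..." and "TasksMax=..." lines on stdin
# and report how much of the cap is in use.
tasks_usage() {
    awk -F= '
        $1 == "TasksCurrent" { cur = $2 }
        $1 == "TasksMax"     { max = $2 }
        END {
            if (cur !~ /^[0-9]+$/)
                print "current count unavailable"
            else if (max !~ /^[0-9]+$/ || max + 0 == 0)
                print cur " tasks, no usable numeric cap (" max ")"
            else
                printf "%d of %d tasks (%d%%)\n", cur, max, cur * 100 / max
        }'
}

# Example (nginx.service is the unit used in the text):
#   systemctl show -p TasksCurrent -p TasksMax nginx.service | tasks_usage
```

Measuring this at peak load gives the "needed" figure; the headroom on top of it is a judgment call for the service in question.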

How do I adjust the system-wide limits (pid_max / threads-max)?

Conclusion: If even the total across all users is short, raise kernel.pid_max and kernel.threads-max. Change them temporarily with sysctl and persist them under /etc/sysctl.d/. This applies to hosts with many containers or heavy concurrency.

Check the system-wide ceilings for concurrent PIDs and threads.

$ cat /proc/sys/kernel/pid_max
4194304
$ cat /proc/sys/kernel/threads-max
65536

Raise them temporarily with sysctl -w. Persist by placing a file under /etc/sysctl.d/.

# Temporary change (lost on reboot)
$ sudo sysctl -w kernel.pid_max=4194304
$ sudo sysctl -w kernel.threads-max=131072

# Persist both keys, so the threads-max change also survives a reboot
$ printf 'kernel.pid_max = 4194304\nkernel.threads-max = 131072\n' | sudo tee /etc/sysctl.d/99-pids.conf
$ sudo sysctl --system

On 64-bit systems pid_max can go up to 4194304 (~4.19 million). But each thread consumes kernel memory, so raising it blindly exhausts memory first. The default for threads-max is auto-computed from RAM, so check headroom with free -h before raising it.

Even after raising pid_max / threads-max, if memory is short, fork will instead fail with ENOMEM (Cannot allocate memory). When EAGAIN turns into ENOMEM, memory — not the limit — is the bottleneck. See Handling the OOM killer.
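Before raising either ceiling, it is worth checking whether the system is actually close to it. The fourth field of /proc/loadavg has the form runnable/total, where total is the number of scheduling entities (threads) that currently exist. The sketch below compares that total with threads-max; the function name system_task_usage is hypothetical, and the file paths are parameters only so the logic can be tried on copies.

```shell
# Hypothetical helper: print the live task total against the thread ceiling.
# $1 = loadavg file, $2 = ceiling file (both default to the real /proc paths).
system_task_usage() {
    loadavg_file=${1:-/proc/loadavg}
    max_file=${2:-/proc/sys/kernel/threads-max}

    # Fields: 1-min, 5-min, 15-min load, "runnable/total" entities, last PID.
    read -r one five fifteen entities lastpid < "$loadavg_file" || return 1
    read -r max < "$max_file" || return 1

    total=${entities#*/}
    echo "tasks=$total threads-max=$max used=$(( total * 100 / max ))%"
}

# Example:
#   system_task_usage
```

If the percentage is low, the system-wide ceilings are not the bottleneck and the per-user or cgroup layer deserves another look.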

Emergency recovery: when even the shell cannot fork

Conclusion: Even when you cannot launch a new command, shell builtin commands run without fork. Use kill, exec, and job control to stop the runaway processes and free up the budget without spawning any external command.

When fork is exhausted, you cannot even start external commands like ps or /usr/bin/kill. You are limited to bash builtins, which do not create a new process.

# The builtin kill (not external /bin/kill) signals a job or PID
$ kill %1
$ kill 2314

You can even locate PIDs without external commands, by peeking into /proc with the builtin echo and globbing.

# List every PID directory via globbing (all users, no external command needed)
$ for p in /proc/[0-9]*; do echo "$p"; done
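The same builtin-only approach can go one step further and point at the likely culprit. The sketch below reads the Name: and Threads: fields from each /proc/<pid>/status using only for, read, case, test, and echo, so it should not need to fork when called directly (avoid wrapping it in a pipe or $(...), both of which fork). The function name busy_pids is hypothetical, and the second argument exists only so the logic can be tried on a copy of the directory layout.

```shell
# Hypothetical helper: list PIDs whose thread count exceeds a threshold, using builtins only.
# $1 = minimum thread count (default 100), $2 = proc root (default /proc).
busy_pids() {
    min=${1:-100}
    root=${2:-/proc}
    for status in "$root"/[0-9]*/status; do
        [ -r "$status" ] || continue        # the process may already have exited
        name="" threads=0
        while read -r key value rest; do
            case $key in
                Name:)    name=$value ;;
                Threads:) threads=$value ;;
            esac
        done < "$status"
        if [ "$threads" -gt "$min" ]; then
            pid=${status%/status}
            echo "${pid##*/} $threads $name"
        fi
    done
}

# Example: show anything above 500 threads, then signal it with the builtin kill.
#   busy_pids 500
#   kill 2314
```

Each output line is PID, thread count, and command name, which is enough to pick a target for the builtin kill shown above.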

If it is still out of control, use exec to run a single external command in place of your shell: exec replaces the current process instead of forking, at the cost of ending that shell session. Alternatively, work from another existing login session (an SSH connection that is still alive). The last resort is a reboot from the console / cloud dashboard, but without finding the root cause (the leak source) the problem will recur.

Before SSH stops accepting new connections, keep one recovery session connected. Limit-exhaustion incidents tend to "block new logins by the time you notice", so a spare session separate from your working one is a lifesaver.

Checklist when it still won't resolve

Conclusion: A fork EAGAIN confirms that a task-count budget is exhausted. Check from the nearest budget outward (per-user ulimit -u → cgroup TasksMax → system pid_max/threads-max) and the cause converges on one of these layers. Also confirm there is no leak.

  • [ ] Did you distinguish EAGAIN (Resource temporarily unavailable) from ENOMEM (Cannot allocate memory)?
  • [ ] Did you compare the ulimit -u value against the current thread count from ps -L?
  • [ ] Did you identify the culprit with ps -eo pid,nlwp,... --sort=-nlwp?
  • [ ] Is that a healthy count, or a thread / process leak?
  • [ ] Did you decide whether the target is a login session or a systemd daemon (limits.conf vs TasksMax)?
  • [ ] Is systemctl show -p TasksMax <service> large enough?
  • [ ] If the global total is short, did you check kernel.pid_max / kernel.threads-max?
  • [ ] After raising a limit, is there memory headroom (free -h)?
