Process Management Practical: Advanced Control and Monitoring Techniques

Process Management Practical - System Monitoring and Optimization

January 11, 2025 | Reading Time: ~15 min | Level: Intermediate to Advanced

With the basics of process management covered, this article moves on to the operational skills used in day-to-day work: job control, priority management, system monitoring, and troubleshooting.

Job Control
nice/renice - Priority Management
System Monitoring Tools
Troubleshooting

1. Job Control

Managing background and foreground jobs in the shell.

Background Execution

Start Command in Background

$ long_running_command &
[1] 12345

Adding & at the end of a command runs it in the background. The shell prints the job number in brackets, followed by the process ID.
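
A background job's PID is also available to scripts through the special variable `$!`, and `wait` collects its exit status later. The following is a minimal sketch; the `sh -c 'sleep 1; exit 3'` command is only a stand-in for a real long-running command, and the variable names are illustrative.

```shell
#!/bin/sh
# Start a background job, remember its PID, and collect its exit status later.
# "sleep 1; exit 3" is a stand-in for a real long-running command.
sh -c 'sleep 1; exit 3' &
job_pid=$!            # PID of the most recent background command

echo "started background job with PID $job_pid"

# ... other work can happen here while the job runs ...

wait "$job_pid"       # blocks until that job exits
job_status=$?         # wait returns the job's exit status
echo "job exited with status $job_status"
```

Capturing `$!` immediately after the `&` matters, because the variable is overwritten by the next background command.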

Move Running Command to Background

$ long_running_command
^Z                    # Suspend with Ctrl+Z
[1]+  Stopped     long_running_command
$ bg                  # Resume in background
[1]+ long_running_command &

Job Management Commands

List Jobs

$ jobs

[1]-  Running     command1 &
[2]+  Stopped     command2

Bring to Foreground

$ fg %1    # Bring job number 1 to foreground

Resume in Background

$ bg %2    # Resume stopped job 2 in background

Continue After Logout with nohup

$ nohup long_running_command &
$ disown %1    # Detach from current shell

💡 Practical Tips

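nohup: Makes the command ignore SIGHUP so it keeps running after an SSH disconnection; output goes to nohup.out unless redirected
disown: Removes the job from the shell's job table so the shell does not send it SIGHUP on exit
screen/tmux: Recommended when you need to reattach to the session later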
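
The nohup pattern can be wrapped in a small helper so that output redirection is never forgotten. This is a sketch under assumptions: `run_detached` and the log path are hypothetical names, and `sleep 30` stands in for the real command.

```shell
#!/bin/sh
# run_detached LOGFILE COMMAND [ARGS...]
# Hypothetical helper: starts COMMAND immune to SIGHUP, sends stdout and
# stderr to LOGFILE, detaches stdin, and prints the PID of the new process.
run_detached() {
    log_file=$1
    shift
    nohup "$@" > "$log_file" 2>&1 < /dev/null &
    echo "$!"
}

# Example: "sleep 30" stands in for a real long-running command.
pid=$(run_detached /tmp/run_detached_demo.log sleep 30)
echo "started PID $pid, logging to /tmp/run_detached_demo.log"
```

Redirecting all three standard streams means the process holds no reference to the terminal, which is usually what lets an SSH session close cleanly.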

2. nice/renice - Priority Management

Adjust process execution priority (nice value).

Nice Value Range

-20: Highest priority (requires root privileges)
0: Default priority
+19: Lowest priority

Smaller values mean higher priority. Any negative value, not just -20, normally requires root privileges.
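
To see the adjustment in action, the nice command can report its own value. The sketch below assumes GNU coreutils (or BusyBox), where running `nice` with no arguments prints the current nice value; the variable names are illustrative.

```shell
#!/bin/sh
# "nice" with no arguments prints the nice value of the current shell.
base=$(nice)
echo "current nice value: $base"

# "nice -n 5 CMD" runs CMD with the nice value raised by 5 (capped at 19).
# Here CMD is "nice" itself, so it simply reports the value it was given.
child=$(nice -n 5 nice)
echo "child nice value:   $child"
```

Note that `-n` is an adjustment relative to the caller, so the child's value depends on the nice value of the shell that launched it.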

Start Process with Low Priority

$ nice -n 10 backup_script.sh

Execute backup script with nice value 10 (low priority)

Change Running Process Priority

$ renice +5 -p 1234

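Change the nice value of process PID 1234 to 5. A regular user can raise the nice value of their own processes but normally cannot lower it again without root privileges.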

Execute with High Priority (root privileges)

$ sudo nice -n -10 critical_process

Execute critical process with high priority

3. System Monitoring Tools

Overall System Load Check

uptime - System Uptime and Load

$ uptime

14:30:01 up 5 days, 2:15, 3 users, load average: 0.15, 0.25, 0.20

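Load average: the average number of processes running or waiting to run over the last 1, 5, and 15 minutes (on Linux, processes in uninterruptible I/O wait are counted as well). Read it against the number of CPU cores: a load of 4.0 means a fully busy 4-core machine but a heavily overloaded single-core one.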
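
Because the raw number only means something relative to the core count, it can help to normalize it. A minimal sketch, assuming Linux (`/proc/loadavg`) and the coreutils `nproc` command; `load_per_core` is a hypothetical helper name.

```shell
#!/bin/sh
# load_per_core: print the 1-minute load average divided by the core count.
# Assumes Linux (/proc/loadavg) and the coreutils "nproc" command.
load_per_core() {
    cores=$(nproc)
    # First field of /proc/loadavg is the 1-minute load average.
    awk -v cores="$cores" '{ printf "%.2f\n", $1 / cores }' /proc/loadavg
}

ratio=$(load_per_core)
echo "1-minute load per core: $ratio"
```

A value that stays above roughly 1.00 suggests more runnable work than the CPUs can serve, although the right threshold depends on the workload.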

vmstat - Virtual Memory Statistics

$ vmstat 1 5    # Display 5 times at 1-second intervals

CPU, memory, I/O, and swap statistics

iostat - I/O Statistics

$ iostat -x 1 5    # Detailed I/O statistics

Detailed disk I/O statistics

sar - System Activity Report

$ sar -u 1 5    # CPU usage
$ sar -r 1 5    # Memory usage

Comprehensive system performance data. Both sar and iostat are part of the sysstat package, which may need to be installed separately.

Process Detailed Information

/proc Filesystem

$ cat /proc/1234/status      # Process status
$ cat /proc/1234/cmdline     # Command line
$ cat /proc/1234/environ     # Environment variables
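
The cmdline and environ files separate their entries with NUL bytes, so plain `cat` prints everything run together. The sketch below converts the separators for readable output; `proc_cmdline` and `proc_environ` are hypothetical helper names, and the current shell's PID (`$$`) is used only as an example target.

```shell
#!/bin/sh
# proc_cmdline PID: print the command line of PID with spaces between arguments.
# /proc/PID/cmdline separates arguments with NUL bytes, not spaces.
proc_cmdline() {
    args=$(tr '\0' ' ' < "/proc/$1/cmdline")
    printf '%s\n' "${args% }"      # drop the trailing separator
}

# proc_environ PID: print one environment variable per line.
# Reading another user's environ normally requires root.
proc_environ() {
    tr '\0' '\n' < "/proc/$1/environ"
}

# Example: inspect the current shell ($$ is its PID).
proc_cmdline "$$"
proc_environ "$$" | head -n 3
```

Keep in mind that environ reflects the environment the process was started with, not later changes it made to its own variables.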

Files Opened by Process

$ lsof -p 1234               # Specific process
$ lsof /var/log/syslog       # Specific file

Network Connections

$ netstat -tulpn             # Listening TCP/UDP sockets with owning process
$ ss -tulpn                  # Same view with ss, the modern replacement

4. Troubleshooting

Case 1: High CPU Usage

Diagnostic Procedure

# 1. Identify high CPU processes
$ top -o %CPU
$ ps aux --sort=-%cpu | head -10

# 2. Investigate process details
$ strace -p PID    # Trace system calls

Case 2: Memory Shortage

Diagnostic Procedure

# 1. Check memory usage
$ free -h
$ ps aux --sort=-%mem | head -10

# 2. Check swap usage
$ swapon --show    # Replaces the deprecated -s output format
$ vmstat 1 5
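
When judging a memory shortage, the kernel's own estimate of available memory is often more useful than the "used" figure, since page cache can be reclaimed. A minimal sketch, assuming a Linux kernel recent enough to expose `MemAvailable` in `/proc/meminfo`; `mem_available_pct` is a hypothetical helper name.

```shell
#!/bin/sh
# mem_available_pct: percentage of RAM the kernel estimates is available
# for new workloads without swapping (MemAvailable / MemTotal).
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%.1f\n", avail / total * 100 }' /proc/meminfo
}

echo "available memory: $(mem_available_pct)%"
```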

Case 3: Zombie Processes

Resolution Method

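# 1. Check for zombie processes (STAT column starts with Z)
$ ps -eo pid,ppid,stat,comm | awk 'NR==1 || $3 ~ /^Z/'

# 2. Identify the parent (PPID column above) and make it reap its children
#    A zombie is already dead and cannot be killed; it disappears once
#    its parent calls wait() or the parent itself exits
$ kill -CHLD parent_process_PID   # Ask the parent to reap its children
$ kill -TERM parent_process_PID   # If that fails, stop and restart the parent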
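
To see what a zombie looks like in practice, one can be created on purpose. The sketch below assumes the procps `ps` command; `list_zombies` is a hypothetical helper, and the demo zombie exists only for a couple of seconds.

```shell
#!/bin/sh
# list_zombies: print "PID PPID STAT COMMAND" for every zombie process.
list_zombies() {
    # "=" after each column name suppresses the header line.
    ps -eo pid=,ppid=,stat=,comm= | awk '$3 ~ /^Z/'
}

# Demo only: create a short-lived zombie. The inner shell starts "sleep 0"
# in the background and then replaces itself with "sleep 2", which never
# calls wait(), so the finished "sleep 0" stays a zombie until "sleep 2" exits.
sh -c 'sleep 0 & exec sleep 2' &
sleep 1
list_zombies
wait
```

Once `sleep 2` exits, the zombie is re-parented to init (or a subreaper) and reaped, which is the same mechanism that makes restarting a misbehaving parent effective.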

Case 4: Unresponsive Process

Gradual Approach

# 1. Try graceful termination
$ kill -TERM PID

# 2. Wait a few seconds, check status
$ ps -p PID

# 3. Force kill as last resort
$ kill -KILL PID

📊 Simple Monitoring Script Example

#!/bin/bash
# System monitoring script

LOG_FILE="/var/log/system_monitor.log"
THRESHOLD_CPU=80
THRESHOLD_MEM=90

# CPU usage check (100 minus the idle percentage reported by top,
# so system and I/O wait time are counted, not only user time)
CPU_IDLE=$(top -bn1 | grep "Cpu(s)" | sed 's/.*, *\([0-9.]*\)[% ]*id.*/\1/')
CPU_USAGE=$(echo "100 - $CPU_IDLE" | bc -l)
if (( $(echo "$CPU_USAGE > $THRESHOLD_CPU" | bc -l) )); then
    echo "$(date): High CPU usage: $CPU_USAGE%" >> "$LOG_FILE"
fi

# Memory usage check (used / total, from the Mem line of free)
MEM_USAGE=$(free | grep Mem | awk '{printf("%.1f", $3/$2 * 100.0)}')
if (( $(echo "$MEM_USAGE > $THRESHOLD_MEM" | bc -l) )); then
    echo "$(date): High memory usage: $MEM_USAGE%" >> "$LOG_FILE"
fi
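
Parsing top output is sensitive to version and locale differences. As an alternative sketch, CPU usage can be computed from two samples of the aggregate `cpu` line in `/proc/stat`. This assumes Linux; `read_cpu` and `cpu_usage_pct` are hypothetical helper names, and counting iowait as idle time is a choice made here, not something the script above prescribes.

```shell
#!/bin/sh
# read_cpu: print "TOTAL IDLE" jiffies from the aggregate "cpu" line of
# /proc/stat (fields: user nice system idle iowait irq softirq steal ...).
read_cpu() (
    read -r label user nice system idle iowait irq softirq steal rest < /proc/stat
    total=$((user + nice + system + idle + iowait + irq + softirq + ${steal:-0}))
    echo "$total $((idle + iowait))"
)

# cpu_usage_pct [INTERVAL]: busy CPU percentage over INTERVAL seconds (default 1).
cpu_usage_pct() (
    interval=${1:-1}
    set -- $(read_cpu); total1=$1; idle1=$2
    sleep "$interval"
    set -- $(read_cpu); total2=$1; idle2=$2
    dt=$((total2 - total1))
    busy=$((dt - (idle2 - idle1)))
    if [ "$dt" -le 0 ] || [ "$busy" -lt 0 ]; then
        echo "0.0"
    else
        tenths=$((busy * 1000 / dt))      # tenths of a percent
        echo "$((tenths / 10)).$((tenths % 10))"
    fi
)

echo "CPU usage: $(cpu_usage_pct 1)%"
```

Sampling over an interval measures current activity, whereas a single read of the counters would only give the average since boot.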

⚠️ Common Mistakes in Practice and Pitfalls

Mistakes that come up often in real operations, and how to handle each one correctly.

🚫 Mistake 1: Misusing nohup

❌ Common Mistake

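$ nohup long_command  # Forgetting &: the terminal stays blocked
$ nohup long_command &
$ exit  # Output ended up in nohup.out, easy to lose track of

Without & the command stays in the foreground, and without redirection the output is appended to nohup.out in the current directory (or in $HOME if that is not writable). A command started with nohup already ignores SIGHUP, so disown is optional here; it matters for jobs started without nohup.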

✅ Correct Usage

# Complete command
$ nohup long_command > output.log 2>&1 &

# Safer method
$ screen -S session_name
$ long_command  # Run within screen session
# Press Ctrl+A, then D to detach; reattach later with: screen -r session_name

Properly redirect output and manage processes.

🚫 Mistake 2: Job Control Confusion

❌ Confusing Example

$ command1 &
$ command2 &
$ jobs    # Can't tell which is which
$ fg %1   # Select wrong job

Difficult to identify when multiple jobs are running.

✅ Manageable Method

# Add meaningful comments
$ backup_script.sh &    # Backup job
$ jobs -l               # Check with PID
$ pgrep -af backup      # Check by name (does not match grep itself)

# Named sessions with tmux/screen
$ tmux new-session -d -s backup 'backup_script.sh'
$ tmux list-sessions

Manage jobs with meaningful names.

🚫 Mistake 3: Misunderstanding nice Values

❌ Incorrect Understanding

# Thinking higher nice value = faster
$ nice -n 19 important_process  # Lowest priority!
$ renice -20 $$  # Regular user specifying highest priority

Reading nice values backwards, or requesting a negative value without root privileges (which fails with a permission error).

✅ Correct Understanding and Usage

# Background tasks at low priority
$ nice -n 10 backup_script.sh

# Important processes at high priority (requires root)
$ sudo nice -n -5 critical_process

# Change priority of existing process
$ sudo renice -10 -p 1234

Understand that smaller nice values mean higher priority.

🚫 Mistake 4: Excessive or Insufficient Monitoring

❌ Problematic Monitoring

# Excessive monitoring (top every second)
$ while true; do top -n 1; sleep 1; done

# Insufficient monitoring
$ ps aux | grep myprocess  # Check only once

Can increase system load or miss problems.

✅ Appropriate Monitoring Methods

# Monitoring at appropriate intervals
$ watch -n 5 'ps aux --sort=-%cpu | head -10'

# Continuous logging
$ vmstat 5 > /tmp/vmstat.log &
$ iostat -x 5 > /tmp/iostat.log &

# Threshold-based monitoring
$ while true; do
    load=$(cut -d' ' -f1 /proc/loadavg)   # 1-minute load average
    if (( $(echo "$load > 2.0" | bc -l) )); then
        echo "$(date): High load: $load" >> /var/log/load.log
    fi
    sleep 60
done

Monitor with appropriate frequency and methods for the purpose.

🚫 Mistake 5: Panic During Troubleshooting

❌ Hasty Response

# When system is heavy, immediately
$ sudo killall -9 httpd     # Force kill everything
$ sudo reboot              # Reboot immediately

Forceful measures without investigation can worsen problems.

✅ Systematic Approach

# 1. Check situation
$ uptime                   # Check load
$ free -h                  # Check memory
$ df -h                    # Check disk

# 2. Identify problem
$ ps aux --sort=-%cpu | head -10  # Top CPU users
$ ps aux --sort=-%mem | head -10  # Top memory users

# 3. Gradual response
$ kill -TERM problematic_pid      # Try graceful termination first
$ sleep 5
$ ps -p problematic_pid           # Check status
# Additional measures as needed

Systematically analyze problems before responding.

🎯 Practical Professional Techniques

📊 Process Monitoring Automation

#!/bin/bash
# Process health monitoring script
PROCESS_NAME="nginx"
RESTART_CMD="sudo systemctl start nginx"

# -x matches the exact process name rather than any substring
if ! pgrep -x "$PROCESS_NAME" > /dev/null; then
    echo "$(date): $PROCESS_NAME stopped. Restarting..." | logger
    $RESTART_CMD
fi

⚡ Performance Optimization

# Parallelize CPU-intensive tasks according to core count
cores=$(nproc)
for i in $(seq 1 $cores); do
    heavy_task.sh chunk_$i &
done
wait  # Wait for all processing to complete
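
The loop above starts exactly one job per core. When there are more chunks than cores, `xargs -P` can keep the number of simultaneous workers capped instead. A sketch under assumptions: `-P` is a GNU/BSD xargs extension rather than POSIX, `run_chunks` is a hypothetical helper, and the inline `sh -c` body stands in for heavy_task.sh.

```shell
#!/bin/sh
# run_chunks MAX_JOBS COUNT: process chunk_1..chunk_COUNT with at most
# MAX_JOBS workers running at the same time.
run_chunks() {
    max_jobs=$1
    count=$2
    # Each input line becomes one invocation: sh -c '...' _ chunk_N
    seq 1 "$count" | sed 's/^/chunk_/' |
        xargs -P "$max_jobs" -n 1 sh -c 'sleep 0.1; echo "finished $1"' _
}

# Example: 8 chunks, never more than one worker per core.
run_chunks "$(nproc)" 8
```

The order of the output lines depends on scheduling, so results should be written per chunk rather than relied on in sequence.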

🛡️ Safe Process Management

# Safe confirmation before process termination
safe_kill() {
    local pid=$1
    local timeout=${2:-10}

    # Check process exists
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "Process $pid does not exist"
        return 1
    fi

    # Try graceful termination
    kill -TERM "$pid"

    # Wait for specified seconds
    for i in $(seq 1 "$timeout"); do
        if ! kill -0 "$pid" 2>/dev/null; then
            echo "Process $pid terminated gracefully"
            return 0
        fi
        sleep 1
    done

    # Force kill
    echo "Force killing process $pid"
    kill -KILL "$pid"
}

Best Practices

📊 Regular Monitoring

Regular system state checks with cron
Regular log file review
Track resource usage trends

🔧 Resource Limit Configuration

Set user limits with ulimit
Service resource limits with systemd
Detailed control with cgroups
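
As a small illustration of the first item, ulimit changes apply to the current shell and everything it starts, so running them in a subshell keeps the limit scoped to one task. This is a sketch: `limited_fd_count` is a hypothetical name, and the value 256 assumes the hard limit on the system is at least that high.

```shell
#!/bin/sh
# Limits set with ulimit apply to the current shell and its children, so a
# subshell keeps the change from leaking into the rest of the session.
limited_fd_count() (
    ulimit -n 256        # cap open file descriptors (assumes hard limit >= 256)
    ulimit -n            # print the limit now in effect
)

echo "limit inside subshell: $(limited_fd_count)"
echo "limit outside:         $(ulimit -n)"
```

For long-running services, the systemd and cgroups options in the list above are usually the better fit, since ulimit only covers processes started from that shell.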

🚨 Alert Configuration

CPU and memory usage thresholds
Disk capacity monitoring
Critical process health monitoring
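
Disk capacity can be checked with the same threshold pattern as the CPU and memory script. A minimal sketch, assuming a filesystem name without spaces; `disk_usage_pct` and the `THRESHOLD_DISK` value are hypothetical.

```shell
#!/bin/sh
# disk_usage_pct PATH: print the used-capacity percentage (no % sign) of the
# filesystem containing PATH. "df -P" forces one line per filesystem.
disk_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

THRESHOLD_DISK=90    # hypothetical alert threshold, in percent
usage=$(disk_usage_pct /)
if [ "$usage" -ge "$THRESHOLD_DISK" ]; then
    echo "$(date): High disk usage on /: ${usage}%"
else
    echo "Disk usage on / is ${usage}%"
fi
```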

Summary

Mastering practical process management skills enables stable system operations.

Key Points

Job control for efficient task management
nice/renice for resource priority adjustment
Monitoring tools for proactive problem discovery
Systematic approach to troubleshooting

Next Steps

📚 Related Learning Topics

Shell Scripting - Automation and task management
System Administration - Service management and maintenance
Network Monitoring - Network performance optimization
Security - Process-level security measures

📊 Complete Process Management Series

Basics - ps, top, kill fundamental operations
Practical (This Article) - Job control, nice, system monitoring

📢 About Affiliate Links

As an Amazon Associate, this site earns from qualifying purchases through product links. This is at no additional cost to you. Book recommendations are from Amazon.co.jp (Japan), chosen for their value to Japanese-speaking learners.

📚 Recommended Books for Process Management Practice & System Operations Learning

These practice-focused books cover process management and system operations in more depth, including job control, monitoring, and troubleshooting.

📚 ゼロからはじめるLinuxサーバー構築・運用ガイド第2版

Target Level: Beginner to Intermediate

A systematic introduction to practical server operations, including process management. Covers skills that carry over directly to day-to-day work, such as job control, monitoring, and troubleshooting.

📚 Linuxシステムプログラミング

Target Level: Intermediate to Advanced

An in-depth treatment of the low-level APIs and system calls behind process control (fork, exec, signal). Useful for understanding how job control, process priorities, and resource management are implemented.

📚 24時間365日サーバ/インフラを支える技術

Target Level: Intermediate to Advanced

Real-world examples of process monitoring, operations automation, and incident response in large-scale server environments, covering continuous monitoring, performance optimization, and troubleshooting.

📚 システム運用アンチパターン

Target Level: Intermediate to Advanced

Common failure patterns in system operations and effective ways to address them, with practical examples of troubleshooting, monitoring design, and operations automation.
