find/grep/awk Master Series Professional
Exercises and Troubleshooting
The final installment of the series. Includes exercises for skill verification, common troubleshooting solutions, and a comprehensive skill development roadmap. Complete your journey to becoming a Linux expert.
9. Common Problems and Solutions
Learn the typical problems you'll inevitably encounter in real-world usage, and how to solve them.
🔍 find Command Issues
❌ Massive "Permission denied" errors
Symptom:
find: '/root': Permission denied
find: '/proc/1': Permission denied
...
✅ Solutions
Method 1: Ignore error output
find / -name "*.txt" 2>/dev/null
Method 2: Search only accessible locations
find /home /var /tmp -name "*.txt"
Method 3: Run with sudo (use with caution)
sudo find / -name "*.txt"
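As a variation, you can prune the noisy virtual filesystems instead of discarding all errors (a sketch; adjust the pruned paths to your system):
find / -path /proc -prune -o -path /sys -prune -o -name "*.txt" -print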
❌ Errors with spaces in filenames
Symptom:
find /home -name "*.txt" -exec rm {} \;
# Error: "My Document.txt" interpreted as "My", "Document.txt"
✅ Solutions
Use -print0 with xargs -0
find /home -name "*.txt" -print0 | xargs -0 rm
Use -exec with +
find /home -name "*.txt" -exec rm {} +
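Before deleting anything, it's worth previewing exactly which files matched by running a harmless command first (a minimal sketch using ls):
# Preview the files that would be deleted
find /home -name "*.txt" -print0 | xargs -0 ls -l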
❌ Search is too slow
✅ Solutions
- Skip unnecessary directories with -path ... -prune
- Limit search depth with -maxdepth
- Restrict the search to files with -type f
find /var -maxdepth 3 -path "*/node_modules" -prune -o -type f -name "*.log" -print
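To confirm the speedup, you can time both variants (results depend on your filesystem and cache state):
time find /var -name "*.log"                      # unrestricted search
time find /var -maxdepth 3 -type f -name "*.log"  # restricted search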
🔍 grep Command Issues
❌ Japanese (multibyte) text not matched correctly
✅ Solutions
Check and modify locale settings
export LANG=ja_JP.UTF-8
grep "γ¨γ©γΌ" logfile.txt
Force text mode when grep misdetects the file as binary
grep -a "エラー" logfile.txt
❌ Regular expressions not working as expected
Common issues:
- +, ?, and {} are treated as literal characters
- Grouping with () doesn't work
✅ Solutions
Use extended regex with -E
grep -E "colou?r" file.txt # ? works correctly
grep -E "(http|https)://" file.txt # grouping works
Use the egrep alias (deprecated in recent GNU grep; prefer grep -E)
egrep "colou?r" file.txt
β "Binary file matches" error
Symptom:
Binary file image.jpg matches
✅ Solutions
Search text files only
grep -I "pattern" * # Skip binary files
Limit file types
grep -r --include="*.txt" --include="*.log" "pattern" .
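When you genuinely need to search inside a binary, strings extracts the printable text so grep can work on it:
strings image.jpg | grep "pattern"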
⚙️ awk Command Issues
❌ Fields not split as expected
Symptom: Commas within CSV fields cause issues
"η°δΈε€ͺι","28","ζ±δΊ¬ι½ζΈθ°·εΊ","γ¨γ³γΈγγ’,γγΌγ γͺγΌγγΌ"
✅ Solutions
Use dedicated tools
csvtool col 1,2 data.csv # extract columns 1 and 2 (requires the csvtool package)
Integrate with Python
python3 -c "
import csv, sys
reader = csv.reader(sys.stdin)
for row in reader: print(row[0], row[1])
" < data.csv
❌ Precision loss in numeric calculations
Symptom: Decimal calculations produce slightly inexact results
awk '{sum+=$1} END {print sum}'
# Expected: 10.50, Actual: 10.5000000001
✅ Solutions
Specify precision with printf
awk '{sum+=$1} END {printf "%.2f\n", sum}' numbers.txt
Integrate with bc command
awk '{print $1}' numbers.txt | paste -sd+ | bc
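Since bc only applies its scale setting during division, one pattern is to compute the sum with paste and then divide with an explicit scale, for example to get a 2-decimal average (assuming one number per line in numbers.txt):
sum=$(awk '{print $1}' numbers.txt | paste -sd+ | bc)
count=$(wc -l < numbers.txt)
echo "scale=2; $sum / $count" | bc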
🔧 Debug Techniques
🎯 Step-by-step Verification
Execute complex commands partially to verify
# Final command
find /var/log -name "*.log" | xargs grep -l "ERROR" | xargs wc -l
# Debug procedure
# 1. Execute find part only
find /var/log -name "*.log"
# 2. Execute up to grep part
find /var/log -name "*.log" | xargs grep -l "ERROR"
# 3. Execute full command
find /var/log -name "*.log" | xargs grep -l "ERROR" | xargs wc -l
📝 Save Intermediate Results
Save intermediate results to files for time-consuming processes
# Process while saving intermediate results
find /var -name "*.log" > all_logs.txt
grep -l "ERROR" $(cat all_logs.txt) > error_logs.txt
wc -l $(cat error_logs.txt) > final_result.txt
10. Further Skill Development
Learn which skills to tackle next, and which career paths open up, after mastering find, grep, and awk.
📈 Skill Development Roadmap
📊 Level 1: Command Mastery (Current Position)
Achieved: find, grep, and awk from basics to advanced
Market Value: +¥1.5M salary increase level
📊 Level 2: Shell Scripting
Goal: Automation scripts combining multiple commands
Study Period: 2-3 months
Market Value: +¥3M salary increase level
📊 Level 3: System Administration & Automation
Goal: Server operations automation
Study Period: 4-6 months
Market Value: +¥5M salary increase level
📊 Level 4: DevOps & Cloud
Goal: CI/CD, infrastructure automation
Study Period: 6-12 months
Market Value: +¥7M salary increase level
📚 Recommended Learning Resources
💻 Continuous Learning
Penguin Gym Linux Advanced Course
Practice environment for learned commands. Improve with advanced challenges
📚 Books & Materials
"Shell Script Rapid Development Techniques"
Practical shell scripting methods
"Introduction to Monitoring"
From monitoring basics to practice
🏅 Certifications
LPIC Level 1
Objective proof of basic Linux skills
AWS Solutions Architect
Cloud infrastructure design skill certification
💼 Career Path Examples
🛠️ Infrastructure Engineer Track
Target Salary: ¥8-12M
📊 Data Engineer Track
Target Salary: ¥7.5-11M
🚀 DevOps Engineer Track
Target Salary: ¥9-13M
11. 🎯 Exercises and Challenges: Skill Verification
Not just theory: practice hands-on to solidify your skills. Work through these exercises to verify your abilities.
🟢 Beginner Challenges
Verify basic command usage.
Challenge 1: Basic File Search
Task: Find files under /var/log directory with .log extension and size 1MB or larger.
💡 Hint
Combine -name and -size options with find command
🎯 Solution
find /var/log -name "*.log" -size +1M
Challenge 2: Basic Text Search
Task: Search for lines containing "ERROR" in system.log file, displaying with line numbers.
🎯 Solution
grep -n "ERROR" system.log
Challenge 3: Basic Data Aggregation
Task: Calculate the sum of the 3rd column (sales) in sales.csv file.
🎯 Solution
awk -F',' '{sum += $3} END {print "Total:", sum}' sales.csv
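If sales.csv starts with a header row, skip it with NR > 1 so the column label isn't folded into the sum:
awk -F',' 'NR > 1 {sum += $3} END {print "Total:", sum}' sales.csv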
🟡 Intermediate Challenges
Practical problems combining multiple commands.
Challenge 4: Log Analysis Pipeline
Task: Count unique IP addresses from today's access log.
💡 Hint
grep for today's date → awk to extract IPs → sort/uniq for deduplication
🎯 Solution
grep "$(date '+%d/%b/%Y')" access.log | awk '{print $1}' | sort -u | wc -l
Challenge 5: Large File Search
Task: Find and display the top 5 largest files (100MB+) in home directory, sorted by size.
🎯 Solution
find /home -type f -size +100M -exec ls -lh {} + | sort -rh -k5 | head -5
Challenge 6: Error Statistics Report
Task: Aggregate error counts by type from multiple log files, displaying in descending order.
🎯 Solution
find /var/log -name "*.log" | xargs grep -h "ERROR" | awk '{print $4}' | sort | uniq -c | sort -rn
🔴 Advanced Challenges
Practical problems requiring advanced techniques and creativity.
Challenge 7: Website Monitoring Script
Task: Identify IP addresses with 10+ 404 errors in the past hour from Apache access log and generate alert messages.
💡 Hint
Time filter → 404 error extraction → IP aggregation → threshold check
🎯 Solution
# Get time 1 hour ago
hour_ago=$(date -d '1 hour ago' '+%d/%b/%Y:%H')
current_hour=$(date '+%d/%b/%Y:%H')
# Detect IPs with many 404 errors
grep -E "($hour_ago|$current_hour)" /var/log/apache2/access.log | \
grep " 404 " | \
awk '{print $1}' | \
sort | uniq -c | \
awk '$1 >= 10 {printf "ALERT: IP %s has %d 404 errors in last hour\n", $2, $1}'
Challenge 8: Data Quality Check
Task: Create a script to check CSV file data quality, reporting:
- Total rows and columns
- Number of blank lines
- Unique value counts per column
- Max, min, and average for numeric columns
🎯 Solution
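Note: this solution uses awk arrays of arrays (field_values[i][$i]), which require GNU awk (gawk) 4.0 or later; with mawk or BSD awk, invoke it as gawk instead.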
awk -F',' '
BEGIN {
print "=== CSV Data Quality Report ==="
}
NR == 1 {
# Process header row
num_columns = NF
for (i = 1; i <= NF; i++) {
headers[i] = $i
}
next
}
NF == 0 {
empty_lines++
next
}
{
total_rows++
# Process each column
for (i = 1; i <= num_columns && i <= NF; i++) {
field_values[i][$i] = 1
# Numeric check
if ($i ~ /^[0-9]+\.?[0-9]*$/) {
numeric_values[i][++numeric_count[i]] = $i
numeric_sum[i] += $i
if (numeric_min[i] == "" || $i < numeric_min[i]) numeric_min[i] = $i
if (numeric_max[i] == "" || $i > numeric_max[i]) numeric_max[i] = $i
}
}
}
END {
printf "Total rows: %d\n", total_rows
printf "Columns: %d\n", num_columns
printf "Blank lines: %d\n", empty_lines + 0
print ""
for (i = 1; i <= num_columns; i++) {
printf "Column %d (%s):\n", i, headers[i]
printf " Unique values: %d\n", length(field_values[i])
if (numeric_count[i] > 0) {
avg = numeric_sum[i] / numeric_count[i]
printf " Numeric stats: min=%.2f, max=%.2f, avg=%.2f\n", numeric_min[i], numeric_max[i], avg
}
print ""
}
}' data.csv
Challenge 9: Automated Backup Script
Task: Create an automated backup script for important files with these features:
- Only files updated since last backup
- Files under 100MB only
- Backup process logging
- Automatic deletion of old backups (7+ days)
🎯 Solution
#!/bin/bash
BACKUP_DIR="/backup/$(date +%Y%m%d_%H%M%S)"
LAST_BACKUP_MARKER="/var/log/last_backup.timestamp"
LOG_FILE="/var/log/backup.log"
echo "=== Backup started at $(date) ===" >> "$LOG_FILE"
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Only filter by -newer when a previous backup marker exists;
# the first run backs up everything
NEWER_ARGS=()
if [[ -f "$LAST_BACKUP_MARKER" ]]; then
    echo "Last backup: $(cat "$LAST_BACKUP_MARKER")" >> "$LOG_FILE"
    NEWER_ARGS=(-newer "$LAST_BACKUP_MARKER")
fi
# Find and back up updated files; process substitution keeps the loop
# in the current shell so backed_up_count survives after the loop
backed_up_count=0
while IFS= read -r file; do
    # Preserve the path relative to /home/important
    rel_path="${file#/home/important/}"
    backup_path="$BACKUP_DIR/$rel_path"
    mkdir -p "$(dirname "$backup_path")"
    if cp "$file" "$backup_path" 2>/dev/null; then
        echo "Backed up: $file" >> "$LOG_FILE"
        ((backed_up_count++))
    else
        echo "Failed to back up: $file" >> "$LOG_FILE"
    fi
done < <(find /home/important -type f -size -100M "${NEWER_ARGS[@]}" 2>/dev/null)
# Delete backup directories older than 7 days (top level only)
find /backup -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} + 2>/dev/null
echo "Old backups cleaned up" >> "$LOG_FILE"
# Update the timestamp marker
date > "$LAST_BACKUP_MARKER"
echo "=== Backup completed. Files backed up: $backed_up_count ===" >> "$LOG_FILE"
🏆 Master Challenge
A professional-level problem that proves you're ready for real-world production work.
Challenge 10: Comprehensive System Monitoring Dashboard
Final Challenge: Create a system monitoring script with these features:
- Real-time log file monitoring
- Automatic alerts on error detection
- System resource usage visualization
- Automatic daily report generation
- Web-based viewing (HTML report generation)
💡 Approach Hints
tail -f for real-time monitoring, awk for statistics, find for old file management, HTML template for report generation
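As a starting point for the HTML report piece, here is a minimal sketch (the log path /var/log/app.log and the error-type field position are hypothetical; adapt them to your format):
#!/bin/bash
# Count errors by type and render them as a bare-bones HTML table
{
  echo "<html><body><h1>Daily Report $(date +%F)</h1><table>"
  awk '/ERROR/ {count[$4]++} END {
    for (e in count) printf "<tr><td>%s</td><td>%d</td></tr>\n", e, count[e]
  }' /var/log/app.log
  echo "</table></body></html>"
} > report.html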
🎯 Achievement Reward: If you can complete this challenge, you are definitely a Linux expert!
12. 🎓 Conclusion: First Step to Linux Mastery
Throughout this 4-part series, we've covered find, grep, and awk commands in detail from fundamentals to practical applications. Mastering these commands makes you a true Linux expert.
💪 You Are Now a Linux Expert
Having worked through and practiced this 4-part series, you are now a Linux expert. You've mastered advanced techniques that many engineers never learn. Be confident and keep advancing your skills.