find, grep, and awk Exercises - Troubleshooting and Practical Skills

The final installment of the series. Includes exercises for skill verification, common troubleshooting solutions, and a comprehensive skill development roadmap. Complete your journey to becoming a Linux expert.

📋 Table of Contents

  1. Common Problems and Solutions
  2. Further Skill Development
  3. Exercises and Challenges
  4. Conclusion: First Step to Linux Mastery

9. Common Problems and Solutions

Learn the typical problems you'll inevitably encounter in real-world use, and how to solve them.

🔍 find Command Issues

❌ Massive "Permission denied" errors

Symptom:

find: '/root': Permission denied
find: '/proc/1': Permission denied
...
✅ Solutions
Method 1: Ignore error output
find / -name "*.txt" 2>/dev/null
Method 2: Search only accessible locations
find /home /var /tmp -name "*.txt"
Method 3: Run with sudo (use with caution)
sudo find / -name "*.txt"
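If you still want to see genuine errors (typos in options, missing paths) while hiding the permission noise, a variant of Method 1 filters stderr instead of discarding it. This is a sketch; SEARCH_ROOT is a placeholder for the directory you actually want to scan:

```shell
# Filter out only the "Permission denied" messages; other errors stay visible.
# SEARCH_ROOT is a placeholder -- set it to / for a full-system search.
SEARCH_ROOT=${SEARCH_ROOT:-.}
find "$SEARCH_ROOT" -name "*.txt" 2>&1 | grep -v "Permission denied"
```

Unlike 2>/dev/null, this keeps any other diagnostics in the output.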

❌ Errors with spaces in filenames

Symptom:

find /home -name "*.txt" | xargs rm
# Error: "My Document.txt" is split into "My" and "Document.txt"
✅ Solutions
Use -print0 with xargs -0
find /home -name "*.txt" -print0 | xargs -0 rm
Use -exec with +
find /home -name "*.txt" -exec rm {} +

❌ Search is too slow

✅ Solutions
  • Skip unnecessary directories with -path ... -prune
  • Limit search depth with -maxdepth
  • Restrict to files only with -type f
find /var -maxdepth 3 -path "*/node_modules" -prune -o -type f -name "*.log" -print

🔎 grep Command Issues

❌ Japanese (multibyte characters) not searched correctly

✅ Solutions
Check and modify locale settings
export LANG=ja_JP.UTF-8
grep "エラー" logfile.txt   # "エラー" = "error"
Avoid binary file treatment
grep -a "エラー" logfile.txt

❌ Regular expressions not working as expected

Common issues:

  • +, ?, {} treated as literal characters
  • Grouping with () doesn't work
✅ Solutions
Use extended regex with -E
grep -E "colou?r" file.txt          # ? works correctly
grep -E "(http|https)://" file.txt  # grouping works
Use the egrep alias (deprecated in recent GNU grep; prefer grep -E)
egrep "colou?r" file.txt

❌ "Binary file matches" error

Symptom:

Binary file image.jpg matches
✅ Solutions
Search text files only
grep -I "pattern" * # Skip binary files
Limit file types
grep -r --include="*.txt" --include="*.log" "pattern" .

⚙️ awk Command Issues

❌ Fields not split as expected

Symptom: Commas within CSV fields cause issues

"Taro Tanaka","28","Shibuya, Tokyo","Engineer,Team Leader"
✅ Solutions
Use dedicated tools
csvtool col 1,2 data.csv # Use csvtool command
Integrate with Python
python3 -c "
import csv, sys
reader = csv.reader(sys.stdin)
for row in reader:
    print(row[0], row[1])
" < data.csv

❌ Precision loss in numeric calculations

Symptom: Inaccurate decimal point calculation results

awk '{sum += $1} END {print sum}' numbers.txt
# Expected: 10.50  Actual: 10.5000000001 (binary floating-point rounding)
✅ Solutions
Specify precision with printf
awk '{sum+=$1} END {printf "%.2f\n", sum}' numbers.txt
Integrate with bc command
awk '{print $1}' numbers.txt | paste -sd+ | bc
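Another option, when you want every printed number formatted the same way, is awk's built-in OFMT variable, which controls how print renders non-integer values. A small sketch with inline sample numbers:

```shell
# OFMT sets the default conversion format that print uses for
# non-integer numeric values (printf statements are unaffected).
printf '10.1\n0.2\n0.2\n' | awk 'BEGIN {OFMT = "%.2f"} {sum += $1} END {print sum}'
# prints 10.50
```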

🔧 Debug Techniques

🎯 Step-by-step Verification

Execute complex commands partially to verify

# Final command
find /var/log -name "*.log" | xargs grep -l "ERROR" | xargs wc -l

# Debug procedure
# 1. Execute the find part only
find /var/log -name "*.log"

# 2. Execute up to the grep part
find /var/log -name "*.log" | xargs grep -l "ERROR"

# 3. Execute the full command
find /var/log -name "*.log" | xargs grep -l "ERROR" | xargs wc -l

📝 Save Intermediate Results

Save intermediate results to files for time-consuming processes

# Process while saving intermediate results
find /var -name "*.log" > all_logs.txt
grep -l "ERROR" $(cat all_logs.txt) > error_logs.txt
wc -l $(cat error_logs.txt) > final_result.txt
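One caveat: the $(cat ...) expansions break if any path contains spaces. A NUL-safe sketch of the same three-stage idea, wrapped in a function (the name collect_error_logs is made up, and the -r flag assumes GNU xargs):

```shell
# collect_error_logs DIR: same three-stage pipeline, but NUL-separated
# so paths with spaces survive. Writes the same three result files.
collect_error_logs() {
    dir=$1
    find "$dir" -name "*.log" -print0 > all_logs.bin
    xargs -0 -r grep -l "ERROR" < all_logs.bin > error_logs.txt
    tr '\n' '\0' < error_logs.txt | xargs -0 -r wc -l > final_result.txt
}
```

Usage: collect_error_logs /var mirrors the original pipeline.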

10. Further Skill Development

After mastering find, grep, and awk, these are the next skills to learn and the career paths they open up.

🎯 Next Level Commands

📊 sed (Stream Editor)

High-speed text replacement, deletion, and insertion

Learning Priority: ⭐⭐⭐⭐⭐

sed 's/error/ERROR/g' logfile.txt

Batch replace all "error" with "ERROR"
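In practice you often want the replacement written back to the file. A minimal sketch using sed's in-place mode (demo.log is a made-up sample file; the attached -i.bak suffix works in both GNU and BSD sed and keeps a backup copy):

```shell
printf 'an error occurred\n' > demo.log   # sample input file (hypothetical)
sed -i.bak 's/error/ERROR/g' demo.log     # edit in place, keep demo.log.bak
```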

🔄 xargs (Argument Conversion)

Convert pipe output to command line arguments

Learning Priority: ⭐⭐⭐⭐⭐

find . -name "*.txt" | xargs -P 4 wc -l

Parallel processing for speed

🗃️ sort/uniq (Sort/Deduplication)

Data sorting and duplicate handling

Learning Priority: ⭐⭐⭐⭐☆

cat access.log | awk '{print $1}' | sort | uniq -c | sort -rn

Sort by access count

🔗 join/paste (File Joining)

Combine data from multiple files

Learning Priority: ⭐⭐⭐☆☆

join -t, file1.csv file2.csv

Join CSV files by common key
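One gotcha worth knowing: join expects both inputs already sorted on the join field, otherwise it silently drops rows. A small sketch with made-up files (users.csv, scores.csv) keyed on column 1, using bash process substitution to sort on the fly:

```shell
# join needs sorted input; sort each file before joining.
printf '2,bob\n1,alice\n' > users.csv    # id,name (unsorted on purpose)
printf '1,90\n2,75\n' > scores.csv       # id,score
join -t, <(sort users.csv) <(sort scores.csv) > joined.csv
```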

🚀 Skill Development Roadmap

📈 Level 1: Command Mastery (Current Position)

Achieved: Find, grep, awk from basics to advanced

Market Value: +¥1.5M salary increase level

📈 Level 2: Shell Scripting

Goal: Automation scripts combining multiple commands

Study Period: 2-3 months

Market Value: +¥3M salary increase level

Key topics: Bash scripting, conditionals & loops, error handling

📈 Level 3: System Administration & Automation

Goal: Server operations automation

Study Period: 4-6 months

Market Value: +¥5M salary increase level

Key topics: cron/systemd, log analysis automation, monitoring scripts

📈 Level 4: DevOps & Cloud

Goal: CI/CD, infrastructure automation

Study Period: 6-12 months

Market Value: +¥7M salary increase level

Key topics: Docker, Kubernetes, AWS/GCP, Terraform

📚 Recommended Learning Resources

🐧 Continuous Learning

Penguin Gym Linux Advanced Course

Practice environment for learned commands. Improve with advanced challenges


📖 Books & Materials

"Shell Script Rapid Development Techniques"

Practical shell scripting methods

"Introduction to Monitoring"

From monitoring basics to practice

🏅 Certifications

LPIC Level 1

Objective proof of basic Linux skills

AWS Solutions Architect

Cloud infrastructure design skill certification

💼 Career Path Examples

🛠️ Infrastructure Engineer Track

Current Skills + System Administration
Server Construction & Operations
Cloud Infrastructure Design
SRE & Infrastructure Architect

Target Salary: ¥8-12M

📊 Data Engineer Track

Current Skills + Data Processing
ETL & Data Pipelines
Big Data Infrastructure
Data Architect

Target Salary: ¥7.5-11M

🔄 DevOps Engineer Track

Current Skills + CI/CD
Automation & Container Tech
Kubernetes Operations
Platform Engineer

Target Salary: ¥9-13M

11. 🎯 Exercises and Challenges: Skill Verification

Theory alone isn't enough: practice hands-on to solidify your skills. Work through these exercises to verify your abilities.

🟢 Beginner Challenges

Verify basic command usage.

Challenge 1: Basic File Search

Task: Find files under /var/log directory with .log extension and size 1MB or larger.

💡 Hint

Combine -name and -size options with find command

🎯 Solution
find /var/log -name "*.log" -size +1M

Challenge 2: Basic Text Search

Task: Search for lines containing "ERROR" in system.log file, displaying with line numbers.

🎯 Solution
grep -n "ERROR" system.log

Challenge 3: Basic Data Aggregation

Task: Calculate the sum of the 3rd column (sales) in sales.csv file.

🎯 Solution
awk -F',' '{sum += $3} END {print "Total:", sum}' sales.csv

🟡 Intermediate Challenges

Practical problems combining multiple commands.

Challenge 4: Log Analysis Pipeline

Task: Count unique IP addresses from today's access log.

💡 Hint

grep for today's date → awk to extract IPs → sort/uniq for deduplication

🎯 Solution
grep "$(date '+%d/%b/%Y')" access.log | awk '{print $1}' | sort -u | wc -l

Challenge 5: Large File Search

Task: Find and display the top 5 largest files (100MB+) in home directory, sorted by size.

🎯 Solution
find /home -type f -size +100M -exec ls -lh {} \; | sort -rh -k5 | head -5

Challenge 6: Error Statistics Report

Task: Aggregate error counts by type from multiple log files, displaying in descending order.

🎯 Solution
find /var/log -name "*.log" | xargs grep -h "ERROR" | awk '{print $4}' | sort | uniq -c | sort -rn
# $4 assumes the error type is the 4th field; adjust for your log format

🔴 Advanced Challenges

Practical problems requiring advanced techniques and creativity.

Challenge 7: Website Monitoring Script

Task: Identify IP addresses with 10+ 404 errors in the past hour from Apache access log and generate alert messages.

💡 Hint

Time filter → 404 error extraction → IP aggregation → threshold judgment

🎯 Solution
# Get the hour strings for one hour ago and the current hour
hour_ago=$(date -d '1 hour ago' '+%d/%b/%Y:%H')
current_hour=$(date '+%d/%b/%Y:%H')

# Detect IPs with many 404 errors
grep -E "($hour_ago|$current_hour)" /var/log/apache2/access.log | \
  grep " 404 " | \
  awk '{print $1}' | \
  sort | uniq -c | \
  awk '$1 >= 10 {printf "ALERT: IP %s has %d 404 errors in last hour\n", $2, $1}'

Challenge 8: Data Quality Check

Task: Create a script to check CSV file data quality, reporting:
- Total rows and columns
- Number of blank lines
- Unique value counts per column
- Max, min, and average for numeric columns

🎯 Solution
# Note: uses true multidimensional arrays (field_values[i][$i]), which require gawk 4+
awk -F',' '
BEGIN {
    print "=== CSV Data Quality Report ==="
}
NR == 1 {
    # Process header row
    num_columns = NF
    for (i = 1; i <= NF; i++) {
        headers[i] = $i
    }
    next
}
NF == 0 {
    empty_lines++
    next
}
{
    total_rows++
    # Process each column
    for (i = 1; i <= num_columns && i <= NF; i++) {
        field_values[i][$i] = 1
        # Numeric check (force numeric comparison with +0)
        if ($i ~ /^[0-9]+\.?[0-9]*$/) {
            numeric_count[i]++
            numeric_sum[i] += $i
            if (numeric_min[i] == "" || $i + 0 < numeric_min[i] + 0) numeric_min[i] = $i
            if (numeric_max[i] == "" || $i + 0 > numeric_max[i] + 0) numeric_max[i] = $i
        }
    }
}
END {
    printf "Total rows: %d\n", total_rows
    printf "Columns: %d\n", num_columns
    printf "Blank lines: %d\n", empty_lines + 0
    print ""
    for (i = 1; i <= num_columns; i++) {
        printf "Column %d (%s):\n", i, headers[i]
        printf "  Unique values: %d\n", length(field_values[i])
        if (numeric_count[i] > 0) {
            avg = numeric_sum[i] / numeric_count[i]
            printf "  Numeric stats: min=%.2f, max=%.2f, avg=%.2f\n", numeric_min[i], numeric_max[i], avg
        }
        print ""
    }
}' data.csv

Challenge 9: Automated Backup Script

Task: Create an automated backup script for important files with these features:
- Only files updated since last backup
- Files under 100MB only
- Backup process logging
- Automatic deletion of old backups (7+ days)

🎯 Solution
#!/bin/bash
BACKUP_DIR="/backup/$(date +%Y%m%d_%H%M%S)"
LAST_BACKUP_MARKER="/var/log/last_backup.timestamp"
LOG_FILE="/var/log/backup.log"

echo "=== Backup started at $(date) ===" >> "$LOG_FILE"

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Only pass -newer when a previous backup marker exists
newer_args=()
if [[ -f "$LAST_BACKUP_MARKER" ]]; then
    echo "Last backup: $(cat "$LAST_BACKUP_MARKER")" >> "$LOG_FILE"
    newer_args=(-newer "$LAST_BACKUP_MARKER")
fi

# Find and back up updated files
# (process substitution keeps backed_up_count in the current shell,
#  unlike a pipeline, where the counter would be lost in a subshell)
backed_up_count=0
while IFS= read -r file; do
    # Back up with the relative path preserved
    rel_path="${file#/home/important/}"
    backup_path="$BACKUP_DIR/$rel_path"
    mkdir -p "$(dirname "$backup_path")"
    if cp "$file" "$backup_path" 2>/dev/null; then
        echo "Backed up: $file" >> "$LOG_FILE"
        ((backed_up_count++))
    else
        echo "Failed to backup: $file" >> "$LOG_FILE"
    fi
done < <(find /home/important -type f -size -100M "${newer_args[@]}" 2>/dev/null)

# Delete backup directories older than 7 days
find /backup -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} + 2>/dev/null
echo "Old backups cleaned up" >> "$LOG_FILE"

# Update the timestamp marker
date > "$LAST_BACKUP_MARKER"
echo "=== Backup completed. Files backed up: $backed_up_count ===" >> "$LOG_FILE"

🏆 Master Challenge

Professional-level problem. Proves you're ready for real-world production work.

Challenge 10: Comprehensive System Monitoring Dashboard

Final Challenge: Create a system monitoring script with these features:

  • Real-time log file monitoring
  • Automatic alerts on error detection
  • System resource usage visualization
  • Automatic daily report generation
  • Web-based viewing (HTML report generation)
💡 Approach Hints

tail -f for real-time monitoring, awk for statistics, find for old file management, HTML template for report generation
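No model answer is given for the master challenge, but the daily HTML report piece can be sketched in a few lines. Everything here is illustrative: LOG_FILE, REPORT, and the keyword list are placeholders, and the real-time tail -f monitoring and resource visualization are left to you:

```shell
#!/bin/bash
# Sketch of Challenge 10's daily HTML report (placeholders: LOG_FILE, REPORT).
LOG_FILE=${LOG_FILE:-/var/log/syslog}
REPORT=${REPORT:-report.html}

generate_report() {
    {
        echo "<html><body>"
        echo "<h1>Daily Log Report: $(date '+%Y-%m-%d')</h1>"
        echo "<h2>Error counts by keyword</h2><ul>"
        for kw in ERROR WARN CRITICAL; do
            # grep -c prints 0 when there are no matches
            count=$(grep -c "$kw" "$LOG_FILE" 2>/dev/null)
            echo "<li>$kw: ${count:-0}</li>"
        done
        echo "</ul></body></html>"
    } > "$REPORT"
}

generate_report
```

Schedule it from cron for the daily-report requirement, and extend generate_report with awk-based statistics for the rest.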

🎯 Achievement Reward: If you can complete this challenge, you are definitely a Linux expert!


12. 🎉 Conclusion: First Step to Linux Mastery

Throughout this 4-part series, we've covered find, grep, and awk commands in detail from fundamentals to practical applications. Mastering these commands makes you a true Linux expert.

🏆 Skills Acquired

✅ find - High-speed file and directory search with any criteria
✅ grep - Advanced text search with regular expressions
✅ awk - Data processing, aggregation, and report generation
✅ Effective combination of all three commands
✅ Performance optimization and troubleshooting
✅ Industry-specific practical examples and real-world skills
✅ Exercise challenges and skill verification

📚 Series Review

Basics

Command overview and regex fundamentals

Advanced

Master ultimate grep and awk techniques

Practical

Combination techniques and real-world use cases

Professional (Completed)

Final mastery with exercises and troubleshooting

📊 Expected Impact

  • 90%: Work Efficiency Improvement
  • +¥3M: Salary Increase Potential
  • 95%: Task Automation Rate

🚀 Practice Now

Knowledge alone isn't enough. Skills develop only through hands-on practice.

📅 Future Learning Plan

  • Week 1: Attempt Challenges 1-3 (verify basics)
  • Week 2: Attempt Challenges 4-6 (practical combinations)
  • Week 3: Attempt Challenges 7-9 (advanced techniques)
  • Week 4: Attempt Master Challenge (comprehensive test)
  • Ongoing: Apply commands in daily work

💪 You Are Now a Linux Expert

Having understood and practiced this 4-part series, you are now a Linux expert. You've mastered advanced techniques that many engineers don't know. Be confident and continue advancing your skills.

🎓 Congratulations! Your journey to Linux mastery is making solid progress.