Regular Expressions: Basic, Extended Regex and grep
What You Will Achieve
- Explain the difference between Basic Regular Expressions (BRE) and Extended (ERE)
- Write accurate patterns using anchors, character classes, and quantifiers
- Use regex appropriately with grep/egrep/sed
- Extract and exclude lines matching specific patterns from logs
- Avoid the "BRE backslash" mistake that appears frequently on the exam
This is the core of LPIC-1 objective 103.7 "Search text files using regular expressions". Regex is a language for describing string patterns and underlies grep / sed / awk.
Deciding Between BRE and ERE
Regex has dialects. LPIC asks about two POSIX kinds.
| Kind | Commands | Treatment of + ? { } ( ) \| |
|---|---|---|
| Basic Regular Expression (BRE) | grep / sed | Backslash required (\+ \{ \() |
| Extended Regular Expression (ERE) | grep -E / egrep / sed -E | Used as-is (+ { () |
The "write \{3\} in grep but {3} in grep -E" difference appears frequently on the exam. Always be aware of which dialect you are writing.
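As a minimal sketch of that difference, the following runs the same interval in both dialects against inline sample input. The three sample strings are made up for illustration, and GNU grep is assumed.

```shell
# Hypothetical sample input: one candidate string per line.
sample=$(printf '%s\n' 'ab' 'aaab' 'a{3}b')

# BRE: the interval needs backslashes, so \{3\} means "exactly three of the preceding a".
bre_hits=$(printf '%s\n' "$sample" | grep 'a\{3\}b')

# ERE: the same interval is written bare.
ere_hits=$(printf '%s\n' "$sample" | grep -E 'a{3}b')

# BRE without backslashes: the braces are ordinary characters,
# so this finds the literal text a{3}b instead.
literal_hits=$(printf '%s\n' "$sample" | grep 'a{3}b')

echo "BRE:            $bre_hits"
echo "ERE:            $ere_hits"
echo "BRE, unescaped: $literal_hits"
```

The third command is the trap: it runs without error, but it answers a different question.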
Metacharacter Reference
| Metachar | Meaning | Example |
|---|---|---|
| . | Any single char | a.c → abc, axc |
| * | 0 or more of preceding | ab* → a, ab, abb |
| ^ | Start-of-line anchor | ^root |
| $ | End-of-line anchor | bash$ |
| [ ] | Character class | [0-9] one digit |
| [^ ] | Negated class | [^0-9] non-digit |
| \+ (ERE +) | 1 or more of preceding | a\+ |
| \? (ERE ?) | 0 or 1 of preceding | colou\?r |
| \{n,m\} (ERE {n,m}) | Between n and m times | [0-9]\{1,3\} |
| \| (OR / alternation) | Match either side. BRE needs the backslash, ERE not | cat\|dog |
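A few of these metacharacters can be tried against a small word list. This is only a sketch: the word list is invented, and -x (covered in Step 1) is used so that each pattern must match the whole line.

```shell
# Hypothetical word list used only for this demonstration.
words=$(printf '%s\n' 'a' 'ab' 'abb' 'abc' 'axc' 'ac' 'a1c')

# . matches exactly one character, so ac (nothing between a and c) is excluded.
dot_hits=$(printf '%s\n' "$words" | grep -x 'a.c')

# * allows zero or more of the preceding b.
star_hits=$(printf '%s\n' "$words" | grep -x 'ab*')

# [^0-9] matches one character that is not a digit, so a1c is excluded.
nondigit_hits=$(printf '%s\n' "$words" | grep -x 'a[^0-9]c')

printf 'a.c      -> %s\n' "$(echo $dot_hits)"
printf 'ab*      -> %s\n' "$(echo $star_hits)"
printf 'a[^0-9]c -> %s\n' "$(echo $nondigit_hits)"
```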
Steps
Step 1: Pin position with anchors
grep '^#' /etc/ssh/sshd_config
grep 'bash$' /etc/passwd
grep -x 'root' users.txt

Example output:

# $OpenBSD: sshd_config
#Port 22
root:x:0:0:root:/root:/bin/bash
root
^ is start of line and $ is end of line, zero-width anchors. grep -x matches the whole line (equivalent to ^pattern$). This is the basis of comment-line extraction and exact-match search.
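To see how the anchors narrow a match, the sketch below counts hits for each form against a throwaway file. The file contents are hypothetical and stand in for users.txt.

```shell
# Hypothetical user list written to a temporary file.
users=$(mktemp)
printf '%s\n' 'root' 'root2' 'chroot' '# root is disabled' > "$users"

starts=$(grep -c '^root' "$users")     # root, root2
ends=$(grep -c 'root$' "$users")       # root, chroot
whole=$(grep -cx 'root' "$users")      # root only
anchored=$(grep -c '^root$' "$users")  # same result as -x
rm -f "$users"

echo "^root: $starts  root\$: $ends  -x: $whole  ^root\$: $anchored"
```

Without any anchor, grep 'root' would match all four lines, including the comment.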
Step 2: Specify ranges with character classes
grep '[0-9]\{1,3\}\.[0-9]\{1,3\}' access.log
grep -E '[0-9]{1,3}(\.[0-9]{1,3}){3}' access.log
grep '[[:space:]]' config.txt

Example output:

192.168.1.10 - - [17/May/2026]
10.0.0.5 - - [17/May/2026]
[0-9] is one digit and \{1,3\} is 1 to 3 repetitions (BRE requires backslashes). [[:space:]] is a POSIX character class matching whitespace.
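The following sketch combines bracket expressions, a BRE interval, and POSIX classes on made-up input lines. The id= prefix is an illustrative assumption.

```shell
# Hypothetical input lines for illustration only.
lines=$(printf '%s\n' 'id=7' 'id=4096' 'id=abc' 'no-spaces' 'has space')

# [0-9]\{2,\} needs at least two consecutive digits (BRE interval with no upper bound).
multi_digit=$(printf '%s\n' "$lines" | grep 'id=[0-9]\{2,\}')

# [[:digit:]] is the POSIX class for decimal digits.
digit_count=$(printf '%s\n' "$lines" | grep -c 'id=[[:digit:]]')

# [[:space:]] matches a space, tab, or other whitespace character.
with_space=$(printf '%s\n' "$lines" | grep '[[:space:]]')

echo "two or more digits: $multi_digit"
echo "lines with a digit after id=: $digit_count"
echo "lines containing whitespace: $with_space"
```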
Step 3: Use BRE and ERE appropriately
grep 'colou\?r' notes.txt
grep -E 'colou?r' notes.txt
grep -Ei 'error|warning|fatal' app.log

Example output:

color theme
favourite colour
[ERROR] disk full
[WARNING] high load
When using ? + | {} (), grep -E (ERE) is more readable. Doing the same in BRE requires backslash escaping and reduces readability.
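A side-by-side sketch of the two dialects, using invented note and log lines. GNU grep is assumed for the \? form.

```shell
# Hypothetical sample lines.
notes=$(printf '%s\n' 'color theme' 'favourite colour' 'colouur typo')
log=$(printf '%s\n' 'ERROR disk full' 'WARNING high load' 'INFO started')

# BRE form (GNU grep): \? makes the preceding u optional.
bre=$(printf '%s\n' "$notes" | grep -c 'colou\?r')

# ERE form of the same pattern.
ere=$(printf '%s\n' "$notes" | grep -Ec 'colou?r')

# ERE grouping plus alternation; -i makes the match case-insensitive.
levels=$(printf '%s\n' "$log" | grep -Eic '^(error|warning|fatal) ')

echo "BRE: $bre  ERE: $ere  severity lines: $levels"
```

Both spellings select the same two lines; the ERE version simply carries fewer backslashes.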
Step 4: Combine extraction and exclusion
grep -v '^#' /etc/fstab | grep -v '^$'
grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' access.log | sort -u
grep -ci 'timeout' app.log

Example output:

UUID=xxxx / ext4 defaults 0 1
10.0.0.5
192.168.1.10
7
-v shows non-matching lines (strip comments/blank lines), -o extracts only the match, -c gives counts, -i ignores case. Frequently used to extract effective lines from config files.
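These options can be exercised on a small stand-in for /etc/fstab. The file contents below are hypothetical, and folding the two -v filters into one ERE is just one possible variant, not the only way to do it.

```shell
# Hypothetical fstab-like file written to a temporary location.
conf=$(mktemp)
printf '%s\n' '# static mounts' '' \
  'UUID=xxxx / ext4 defaults 0 1' \
  '#UUID=yyyy /data ext4 defaults 0 2' > "$conf"

# -v with one ERE drops comment lines and blank lines in a single pass.
effective=$(grep -Ev '^(#|$)' "$conf")

# -o prints only the matched text, one match per line.
ids=$(grep -o 'UUID=[a-z]*' "$conf")

# -c counts matching lines, -i ignores case.
uuid_lines=$(grep -ci 'uuid' "$conf")
rm -f "$conf"

echo "effective: $effective"
echo "ids: $(echo $ids)"
echo "lines mentioning uuid: $uuid_lines"
```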
Step 5: Replace with regex in sed
echo "2026-05-17" | sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3\/\2\/\1/'
sed -n '/^ERROR/p' app.log

Example output:

17/05/2026
ERROR connection refused
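sed -E is ERE mode. \1, \2, and \3 are backreferences that reuse, on the replacement side, the groups captured with ( ). The slashes in the replacement are written as \/ because / is also the delimiter of the s command. This is useful for date format conversion and similar rewrites.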
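The sketch below shows the same date rewrite in both dialects, plus the & shorthand for the whole match. A dot is used as the output separator here purely to avoid escaping slashes; that choice is an assumption for the example.

```shell
# ERE: groups and intervals are written bare.
swapped=$(echo '2026-05-17' | sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3.\2.\1/')

# BRE: the same substitution needs backslashes on groups and intervals.
swapped_bre=$(echo '2026-05-17' | sed 's/\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)/\3.\2.\1/')

# & in the replacement stands for the entire matched text.
quoted=$(echo 'ERROR connection refused' | sed -E 's/^[A-Z]+/[&]/')

echo "$swapped"
echo "$swapped_bre"
echo "$quoted"
```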
Why BRE Needs Backslashes
BRE preserves the historical grep / ed syntax, in which + ? { ( | are ordinary literal characters. To use them as quantifiers, groups, or alternation, you mark them as metacharacters with a backslash. Strictly speaking, POSIX BRE defines only \{ \} and \( \); \+, \? and \| are GNU extensions that GNU grep and GNU sed accept. ERE removes this clutter by treating these characters as metacharacters from the start. grep -E / egrep enable ERE.
This design difference explains why grep 'a+' does not match one or more a characters: in BRE it searches for the literal string a+. LPIC asks about exactly this pitfall.
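The pitfall is easy to reproduce. In this sketch the two input lines are invented, and the \+ form assumes GNU grep.

```shell
# Hypothetical input: one line of repeated a, one line containing the text a+.
sample=$(printf '%s\n' 'aaa' 'a+b')

# BRE: + is an ordinary character, so only the line containing a+ matches.
bre_plain=$(printf '%s\n' "$sample" | grep 'a+')

# ERE: + is a quantifier, so every line with at least one a matches.
ere_plus=$(printf '%s\n' "$sample" | grep -Ec 'a+')

# GNU BRE with the backslash behaves like the ERE form.
bre_escaped=$(printf '%s\n' "$sample" | grep -c 'a\+')

echo "grep 'a+'    -> $bre_plain"
echo "grep -E 'a+' -> $ere_plus lines"
echo "grep 'a\\+'   -> $bre_escaped lines"
```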
Troubleshooting
Symptom: grep 'a+' does not match one-or-more
Cause: In BRE, + is treated as a literal character
Check:
grep 'a\+' file
grep -E 'a+' file
Fix: Use ERE (grep -E), or escape as \+ in BRE.
Symptom: A dot matches any character and causes false positives
Cause: . means any single character in regex
Check:
grep '192\.168' access.log
Fix: Escape as \. to search for a literal dot. For fixed-string search, grep -F (fgrep) is safe.
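A quick way to observe the false positive, using made-up log lines in which one entry has an x where the dot should be:

```shell
# Hypothetical access-log lines; the second one is the false positive.
log=$(printf '%s\n' '192.168.1.10 GET /' '192x168.1.10 GET /' '10.0.0.5 GET /')

unescaped=$(printf '%s\n' "$log" | grep -c '192.168')   # . also matches the x
escaped=$(printf '%s\n' "$log" | grep -c '192\.168')    # literal dot only
fixed=$(printf '%s\n' "$log" | grep -cF '192.168')      # whole pattern taken literally

echo "unescaped: $unescaped  escaped: $escaped  -F: $fixed"
```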
Symptom: Cannot search a string containing special characters
Cause: [ * $ etc. are interpreted as metacharacters
Check:
grep -F '[error]' app.log
Fix: If the pattern needs no regex, use grep -F for fixed-string search. Otherwise, escape only the characters that need it with a backslash (for example \[ or \*).
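The sketch below shows what goes wrong when [error] is read as a regex, and two ways around it. The log lines are hypothetical.

```shell
# Hypothetical log lines.
log=$(printf '%s\n' '[error] disk full' 'timeout after 30s' 'status: up')

# As a regex, [error] is a bracket expression: any single e, r, or o.
as_regex=$(printf '%s\n' "$log" | grep -c '[error]')

# -F treats the whole pattern as a fixed string.
as_fixed=$(printf '%s\n' "$log" | grep -F '[error]')

# Escaping only the opening bracket is enough; a ] outside a
# bracket expression is already an ordinary character.
as_escaped=$(printf '%s\n' "$log" | grep '\[error]')

echo "regex reading matched $as_regex lines"
echo "-F:      $as_fixed"
echo "escaped: $as_escaped"
```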
Completion Checklist
- [ ] Verified start/end-of-line match with ^ and $ anchors
- [ ] Combined character classes like [0-9] with quantifiers
- [ ] Verified BRE (\{3\}) vs ERE (grep -E '{3}') difference on a real machine
- [ ] Tried the -v, -o, -c, and -i options
- [ ] Replaced using sed -E backreferences (\1)
Summary
| Scenario | Syntax | Purpose |
|---|---|---|
| Start match | grep '^pattern' | Comment/specific-line extraction |
| Exact match | grep -x / ^...$ | Whole-line match |
| One or more | grep -E 'a+' | Detect repeated chars |
| Match only | grep -o | Get only the matched part |
| Replace | sed -E 's/.../.../' | Reformat with backreferences |
Regex underlies grep / sed / awk. Next, move on to other exam areas such as link mechanisms and process priorities to connect the knowledge.