Bash Regular Expression Cheatsheet
grep vs grep -E
The difference between grep and grep -E is that grep uses basic regular expressions (BRE) while grep -E uses extended regular expressions (ERE). In basic regular expressions, the characters “?”, “+”, “{”, “|”, “(” and “)” lose their special meaning; instead, use the backslashed versions “\?”, “\+”, “\{”, “\|”, “\(” and “\)”.
$ echo "987-123-4567" | grep "^[0-9]\{3\}-[0-9]\{3\}-[0-9]\{4\}$"
$ echo "987-123-4567" | grep -E "^[0-9]{3}-[0-9]{3}-[0-9]{4}$"
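As a rough sketch of the difference, the same phone-number check can be wrapped in two small functions, one per syntax. The names is_phone_bre and is_phone_ere are made up for this illustration. The last two commands show that an unescaped {3} is literal text in basic syntax but a repeat count in extended syntax.

```shell
# Hypothetical helpers (names are illustrative, not part of grep).
# Both succeed (exit status 0) when $1 has the form NNN-NNN-NNNN.
is_phone_bre() {
  # BRE: interval braces must be backslashed.
  printf '%s\n' "$1" | grep -q '^[0-9]\{3\}-[0-9]\{3\}-[0-9]\{4\}$'
}

is_phone_ere() {
  # ERE: interval braces are special without a backslash.
  printf '%s\n' "$1" | grep -Eq '^[0-9]{3}-[0-9]{3}-[0-9]{4}$'
}

is_phone_bre "987-123-4567" && echo "BRE: match"
is_phone_ere "987-123-4567" && echo "ERE: match"
is_phone_ere "987-1234567"  || echo "ERE: no match"

# In BRE an unescaped "{3}" is literal text; in ERE it is a repeat count.
printf 'a{3}\naaa\n' | grep    'a{3}'   # selects the line "a{3}"
printf 'a{3}\naaa\n' | grep -E 'a{3}'   # selects the line "aaa"
```

Using grep -q keeps the helpers silent, so they can be used directly in if statements.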
tr Translates Characters
# tr translates whitespace characters to newlines
# (tr works on character sets, not regular expressions, so no "+" is needed)
$ echo "a b c" | tr "[:space:]" "\n"
a
b
c
# tr -s squeezes repeated spaces into one
$ echo "a  b   c" | tr -s " "
a b c
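The two uses above can be combined. The following is a minimal sketch, assuming a tr that repeats the last character of the second set when it is shorter than the first (GNU and BSD tr both do this). The function names split_words and to_upper are hypothetical.

```shell
# Hypothetical helper: print each whitespace-separated word on its own line.
# -s squeezes runs of the replacement character, so repeated blanks
# do not produce empty lines between words.
split_words() {
  tr -s '[:space:]' '\n'
}

printf 'a  b\t c\n' | split_words

# tr maps characters one-to-one; here each lowercase letter
# is mapped to its uppercase counterpart.
to_upper() {
  tr '[:lower:]' '[:upper:]'
}

echo "a b c" | to_upper
```

The first command prints a, b and c on separate lines; the second prints A B C.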
uniq Filters out Repeated Lines
$ echo "a a b b c" | tr " " "\n" | sort | uniq
a
b
c
# display count
$ echo "a a b b a c" | tr " " "\n" | sort | uniq -c
3 a
2 b
1 c
Note that uniq only collapses duplicate lines that are adjacent. If equal lines are separated by other lines, uniq does not merge them. Therefore, sort the input first so that identical lines are grouped together before running uniq.
$ echo "a a b b a c" | tr " " "\n" | uniq
a
b
a
c
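The adjacency rule can be checked by counting lines with and without a preceding sort. This is only an illustrative sketch; count_runs and count_distinct are hypothetical names, and the trailing tr -d strips the padding that some wc implementations add.

```shell
# Hypothetical helpers illustrating that uniq only merges adjacent lines.
count_runs() {
  # Without sort: counts runs of adjacent equal lines.
  uniq | wc -l | tr -d ' '
}

count_distinct() {
  # With sort: equal lines become adjacent, so this counts distinct lines.
  sort | uniq | wc -l | tr -d ' '
}

printf 'a\na\nb\nb\na\nc\n' | count_runs      # 4 runs: a, b, a, c
printf 'a\na\nb\nb\na\nc\n' | count_distinct  # 3 distinct lines: a, b, c

# sort -u is a shorthand for "sort | uniq" when no counts are needed.
printf 'a\na\nb\nb\na\nc\n' | sort -u
```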
sort lines
# sort lines
$ echo "b a c d" | tr " " "\n" | sort
# sort lines in reverse order
$ echo "b a c d" | tr " " "\n" | sort -r
# sort by a field (here the count in field 1, compared numerically)
$ echo "b a b c d" | tr " " "\n" | sort | uniq -c | sort -k1,1n
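Putting tr, sort and uniq together gives a small word-frequency pipeline. The sketch below is one possible arrangement, not the only one; word_freq is a hypothetical name. The key -k1,1nr sorts on the count only, numerically and in descending order, and -k2,2 breaks ties by word.

```shell
# Hypothetical helper: word frequencies, most frequent first.
word_freq() {
  tr -s '[:space:]' '\n' |   # one word per line
    sort |                   # group identical words together
    uniq -c |                # prefix each word with its count
    sort -k1,1nr -k2,2       # count descending, then word ascending
}

echo "b a b c d b a" | word_freq
```

For this input the words come out in the order b, a, c, d with counts 3, 2, 1, 1. The width of the padding before each count depends on the uniq implementation.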