Grep
1 GREP
Dependencies: Before learning grep, it helps to be familiar with:
- Unix Pipes
- Shell Basics
- Data Streams
- Unix-like Operating Systems
1.1 Simple description of Grep:
Grep is a UNIX command that searches text for lines matching a pattern.
If we want to look for "text" in $file, then grep is exactly the tool we need!
It can also be used in a UNIX pipeline, with |
- Before we start learning grep, please create the following file, named example_file.txt:

The Apple is red
The orange is old
The pear is tasty
The date of the party
The dragon likes to eat fruit
I don't know what dewberry is
My berries are too old
for $FILE
*bold*
\path\to\file
100.00
52.34
Up here^
[some stuff]
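One way to create the file from the shell is a here-document. This is only a sketch: the line breaks below are an assumption inferred from the examples later in this tutorial (for instance, "for $FILE" is treated as a line of its own), and the delimiter is quoted so that the shell leaves $FILE and the backslashes untouched.

```shell
# Quoting the delimiter ('EOF') stops the shell from expanding $FILE
# or interpreting the backslashes inside the here-document.
cat > example_file.txt <<'EOF'
The Apple is red
The orange is old
The pear is tasty
The date of the party
The dragon likes to eat fruit
I don't know what dewberry is
My berries are too old
for $FILE
*bold*
\path\to\file
100.00
52.34
Up here^
[some stuff]
EOF

# Sanity check: the empty pattern matches every line, so -c counts them all.
line_count=$(grep -c '' example_file.txt)
echo "example_file.txt has $line_count lines"
```

If the count printed is not 14, the file was probably pasted with different line breaks.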
1.2 Basic example
$ grep orange example_file.txt
The orange is old
That simple command shows the basic way to use grep: you call it with two arguments. The first is the text you want to find; the second is the file in which to look for it. Of course that is very simple, so we should look at better examples.
1.3 Better examples
Now that we have seen grep in action, it is time to put it to better use; after all, grep is one of the most powerful and useful commands in the UNIX world.
First of all, you should notice that grep is case sensitive. This means that if you call
$ grep foo file.txt
grep will only find "foo" but not "Foo".
- Case-insensitive search can be done with the -i flag:
$ grep -i apple example_file.txt
The Apple is red
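The difference between the two modes can be seen side by side in the following sketch. The scratch file fruits.txt and the variable names are made up for the illustration; only the -i flag comes from the text above.

```shell
# Hypothetical scratch file so the example is self-contained.
tmp=$(mktemp -d)
printf '%s\n' 'The Apple is red' 'The orange is old' > "$tmp/fruits.txt"

# Default (case-sensitive): "apple" does not match "Apple".
# grep exits with status 1 when nothing matches, hence the "|| true".
sensitive=$(grep apple "$tmp/fruits.txt" || true)

# With -i, "apple" also matches "Apple".
insensitive=$(grep -i apple "$tmp/fruits.txt")

printf 'without -i: [%s]\n' "$sensitive"
printf 'with -i:    [%s]\n' "$insensitive"

rm -r "$tmp"
```

The first line printed shows empty brackets, the second shows the matching line.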
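It is common in the UNIX world to need to search for text that contains metacharacters such as ^ $ \ . * [ ], which grep normally treats as special.
- Metacharacters can be matched literally with the -F flag, which makes grep treat the pattern as a fixed string:
$ grep -F '$FILE' example_file.txt
for $FILE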
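To see why -F matters, it helps to pick a pattern whose meaning changes a lot when read as a regular expression. The sketch below uses [some stuff] for that purpose; the scratch file sample.txt and the variable names are assumptions made for the illustration.

```shell
# Hypothetical scratch file with a few lines from the example file.
tmp=$(mktemp -d)
printf '%s\n' 'The Apple is red' '100.00' '52.34' '[some stuff]' > "$tmp/sample.txt"

# Read as a regular expression, [some stuff] is a bracket set: any line
# containing one of the characters s, o, m, e, t, u, f or a space matches.
regex_count=$(grep -c '[some stuff]' "$tmp/sample.txt")

# With -F the same text is taken literally, brackets included.
fixed_count=$(grep -c -F '[some stuff]' "$tmp/sample.txt")

echo "lines matched as a regular expression: $regex_count"
echo "lines matched as a fixed string:       $fixed_count"

rm -r "$tmp"
```

The -c flag makes grep print the number of matching lines instead of the lines themselves, which keeps the comparison short.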
- Another common case is the need to look for two or more different strings. We have different options here:
  - Match any of the strings: use one -e flag per string
$ grep -e 'foo' -e 'bar' file.txt
  - Match all of the strings: use a pipeline | to stream the output of one grep into another
$ grep 'foo' file.txt | grep 'bar'
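The two options can be compared on a small input. This is a minimal sketch; the file words.txt and its four lines are invented for the example.

```shell
# Hypothetical scratch file: one line per combination of the two strings.
tmp=$(mktemp -d)
printf '%s\n' 'foo only' 'bar only' 'foo and bar' 'neither' > "$tmp/words.txt"

# "Any": a line is printed if it matches at least one -e pattern.
any=$(grep -e 'foo' -e 'bar' "$tmp/words.txt")

# "All": the second grep only sees lines that already matched 'foo'.
all=$(grep 'foo' "$tmp/words.txt" | grep 'bar')

echo "match any:"
echo "$any"
echo "match all:"
echo "$all"

rm -r "$tmp"
```

The "any" search prints three lines, while the "all" search prints only the line containing both strings.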
- Sometimes you are looking for a pattern. For example, you have a bunch of names like file01.txt, file02.txt and so on. In this case a period . can be used as a wildcard that matches exactly one character:
$ ls
file01 file02 file03 file001
$ ls | grep file.1
file01
- Notice that file001 was not matched, since . stands for exactly one character. If you need to match more characters, just add more periods, one for each character:
$ ls
file01 file02 file03 file001
$ ls | grep file..1
file001
Now that we know how to use the wildcard . you might be wondering what happens if you are looking for a file name that contains a literal period, like file.txt.
- In this case the correct command is grep 'file\.txt', using both the single quotes ' and the backslash \. Otherwise the unescaped . acts as a wildcard and grep will also match file1txt, file-txt, fileatxt and so on:
$ ls
file.txt file1txt file-txt fileatxt
$ ls | grep 'file\.txt'
file.txt
1.3.1 Table for grep commands
Metacharacter | Function | Example | Description |
---|---|---|---|
^ | Beginning-of-line anchor | '^Up' | Will display all lines beginning with Up |
$ | End-of-line anchor | 'old$' | Will display all lines ending with old |
. | Matches any single character | 'a..e' | Will display lines containing an a, followed by any two characters, followed by an e |
* | Matches zero or more occurrences of the character preceding the asterisk | 'too*' | Will display lines containing 'to' followed by zero or more 'o' characters, such as 'to' or 'too' |
[ ] | Matches a single character in the set | '[Aa]pple' | Will display lines containing Apple or apple |
[^] | Matches a single character not in the set | '[^Tt]he' | Will display lines containing he preceded by any character other than T or t |
\< | Beginning-of-word anchor | '\<date' | Will display lines containing a word that begins with "date" |
\> | End-of-word anchor | 'ear\>' | Will display lines containing a word that ends with "ear" |
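A few of the table rows can be tried directly. The sketch below assumes a scratch file, sample.txt, holding six lines taken from the example file; the variable names are illustrative. Note that \< and \> are extensions rather than part of POSIX, so this assumes a grep that supports them, such as GNU grep.

```shell
# Hypothetical scratch file with a subset of the example lines.
tmp=$(mktemp -d)
printf '%s\n' \
    'The Apple is red' \
    'The orange is old' \
    'The pear is tasty' \
    'The date of the party' \
    'My berries are too old' \
    'Up here^' > "$tmp/sample.txt"

# Single quotes keep the shell from touching $, \ and the brackets.
starts_with_up=$(grep '^Up' "$tmp/sample.txt")      # line anchored at the start
ends_with_old=$(grep 'old$' "$tmp/sample.txt")      # line anchored at the end
apple_any_case=$(grep '[Aa]pple' "$tmp/sample.txt") # character set
word_start=$(grep '\<date' "$tmp/sample.txt")       # word beginning with "date"
word_end=$(grep 'ear\>' "$tmp/sample.txt")          # word ending with "ear"

printf '%s\n' "$starts_with_up" "$ends_with_old" "$apple_any_case" \
    "$word_start" "$word_end"

rm -r "$tmp"
```

Only 'old$' matches more than one line here, because two of the sample lines end in "old".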
1.4 What else can be done
Cool stuff that you can do now with this new knowledge.
A basic use of the pipeline with grep is explained next:
$ grep -v "e$" example_file.txt | grep "^d"
In the first command, grep -v "e$" example_file.txt, the pattern "e$" matches all lines ending in "e". The -v flag inverts the match, so the lines matching "e$" are omitted and only the remaining lines are written to stdout.
(Note: in regular expressions, $ represents the end of a line; conversely, ^ represents the beginning of a line.)
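The output of the first command is listed here (the user never sees it, because it is piped straight into the second command). Note that this output assumes an input file listing one fruit name per line, rather than the full sentences of example_file.txt:
pear
dragonfruit
dewberry
berries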
This output is then piped into the second command, grep "^d". Just as "$" after "e" means match e at the end of the line, "^" before "d" means match d at the start of the line.
So the final output would be:
dragonfruit
dewberry
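The two stages can be reproduced separately to see what each one contributes. This sketch assumes a hypothetical file, fruits.txt, with one fruit name per line, chosen so that the intermediate and final results match the ones described in this section.

```shell
# Hypothetical input: one fruit name per line.
tmp=$(mktemp -d)
printf '%s\n' apple orange pear date dragonfruit dewberry berries > "$tmp/fruits.txt"

# Stage 1: -v inverts the match, keeping lines that do NOT end in "e".
stage1=$(grep -v 'e$' "$tmp/fruits.txt")

# Stage 2: of those lines, keep only the ones that start with "d".
stage2=$(grep -v 'e$' "$tmp/fruits.txt" | grep '^d')

echo "after the first grep:"
echo "$stage1"
echo "after the second grep:"
echo "$stage2"

rm -r "$tmp"
```

Single quotes are used around the patterns here; the double quotes in the original command work as well, because a $ directly before the closing quote is not expanded by the shell.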
1.5 Exercise
exercise.txt: exercise.sh:
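# Download exercise.txt with wget if it is installed, otherwise fall back to curl.
if command -v wget >/dev/null 2>&1; then
    wget komprendo.net/x/x/exercise.txt
else
    curl komprendo.net/x/x/exercise.txt > exercise.txt
fi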
1.6 Resources
info grep (better content than man grep)