AWK
1 AWK and GAWK
Dependencies: before learning awk, it helps to be familiar with the following topics:
- Unix Pipes
- Shell Basics
- Data Streams
- Unix-like Operating Systems
Before we start learning awk, please create the following file, named animals.txt:
Animals Quantity
Dogs 32
Cats 17
Birds 25
Cows 7
Ducks 9
Pigs 12
1.1 Simple Description of AWK:
AWK is a programming language that originated on UNIX. It is designed for text processing, and it is commonly used for data extraction and report generation. As a programming language, Awk has a syntax that resembles C in many places and will feel somewhat familiar to users of languages such as Python and Bash.
(Note: this text covers GNU awk, or GAWK, which is the GNU Project's implementation of AWK. End Note)
1.1.1 Typical uses of awk:
- Text processing.
- Data extraction.
- Formatted text reports.
- Arithmetic operations.
- String operations.
- Many more!
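As a small illustration of the arithmetic and string operations listed above, here is a minimal sketch. It assumes the animals table from this tutorial, held in a shell variable named data (a name chosen only for this example) so that the snippet does not depend on any file. NR, toupper and length are standard awk features.

```shell
# Sample data: the animals table used in this tutorial (a header plus six rows).
data='Animals Quantity
Dogs 32
Cats 17
Birds 25
Cows 7
Ducks 9
Pigs 12'

# Arithmetic: skip the header (NR > 1), add up the second field,
# and print the total once all input has been read (END).
total=$(printf '%s\n' "$data" | awk 'NR > 1 { sum += $2 } END { print sum }')
echo "$total"    # prints 102

# String operations: print each animal name in upper case, followed by its length.
names=$(printf '%s\n' "$data" | awk 'NR > 1 { print toupper($1), length($1) }')
echo "$names"    # first line printed is "DOGS 4"
```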
1.2 Basic example
The first program we will write is very simple: it just prints "hello world!":
$ echo 'hello world!' | awk '{print}'
hello world!
What just happened? An awk program is made of patterns and actions: a pattern selects lines of the input, and an action is the rule applied to each selected line. Here the program {print} has no pattern, only the action print, and an action without a pattern is applied to every line of the input. That is why we see "hello world!" as the result.
As we can see, awk is a language that processes its input line by line. In a nutshell, this is what awk does:
- Gets an input, a pattern to look for, and a rule to apply.
- Reads the input.
- Examines the first line of the input, looking for the pattern.
- If the pattern matches, applies the rule.
- Moves to the next line, and repeats until the end of the file.
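The steps above can be sketched with a program that has both a pattern and an action. This is only a minimal example: the data variable is a name invented for the illustration, and it holds the animals table from this tutorial. In the second command the pattern is a condition instead of a regular expression, and the header is skipped with NR > 1 because the word "Quantity" is not a number.

```shell
# Sample data: the animals table used in this tutorial.
data='Animals Quantity
Dogs 32
Cats 17
Birds 25
Cows 7
Ducks 9
Pigs 12'

# Pattern /Cats/ selects the lines that contain "Cats";
# the action inside the braces runs only for those lines.
matched=$(printf '%s\n' "$data" | awk '/Cats/ { print "found:", $0 }')
echo "$matched"    # prints: found: Cats 17

# A pattern can also be a condition: after the header (NR > 1),
# print the name of every animal whose quantity is above 20.
large=$(printf '%s\n' "$data" | awk 'NR > 1 && $2 > 20 { print $1 }')
echo "$large"      # prints Dogs and Birds, one per line
```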
Of course, awk can take multiple patterns and rules; for each line of input, all of them are tried in order. We can also change awk's behaviour so that it splits the input on something other than line breaks.
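A minimal sketch of a program with several patterns, each with its own action, tried in order for every line. The data variable and the words "many" and "odd" are made up for this example. The second command shows one possible reading of the remark about changing awk's behaviour: it assumes the remark refers to the record separator variable RS, which makes awk split its input on a character other than the line break.

```shell
# Sample data: the animals table used in this tutorial.
data='Animals Quantity
Dogs 32
Cats 17
Birds 25
Cows 7
Ducks 9
Pigs 12'

# Three pattern/action pairs, tried from top to bottom for each line.
# A line that matches more than one pattern runs more than one action.
tagged=$(printf '%s\n' "$data" | awk '
  NR == 1     { next }              # skip the header line
  $2 >= 20    { print $1, "many" }  # quantity of 20 or more
  $2 % 2 == 1 { print $1, "odd" }   # odd quantity
')
echo "$tagged"     # Birds (25) appears twice: once as "many", once as "odd"

# Assumption: splitting on ";" instead of line breaks by setting RS.
# NR then counts records, not lines.
records=$(printf 'Dogs 32;Cats 17;Birds 25' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')
echo "$records"    # prints "1: Dogs 32", "2: Cats 17", "3: Birds 25"
```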
1.3 Better examples
With awk we can print the content of a file with:
$ awk '{print $0}' animals.txt
Animals Quantity
Dogs 32
Cats 17
Birds 25
Cows 7
Ducks 9
Pigs 12
We can also print only the first field of each line with $1:
$ awk '{print $1}' animals.txt
Animals
Dogs
Cats
Birds
Cows
Ducks
Pigs
We can also print only the second field of each line with $2:
$ awk '{print $2}' animals.txt
Quantity
32
17
25
7
9
12
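What just happened? "$0", "$1" and "$2" look like the positional parameters of a shell script, but in awk they mean something different. Instead of the script name and the first and second arguments, they stand for the entire current line and the first and second fields of that line, respectively. By default, fields are separated by whitespace.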
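Field variables can also be combined and reordered. The sketch below assumes the same animals table, stored in a shell variable named data only to keep the example self-contained. It also uses NF, the standard awk variable that holds the number of fields in the current line, so $NF is the last field.

```shell
# Sample data: the animals table used in this tutorial.
data='Animals Quantity
Dogs 32
Cats 17
Birds 25
Cows 7
Ducks 9
Pigs 12'

# Print the second field before the first one, swapping the two columns.
swapped=$(printf '%s\n' "$data" | awk '{ print $2, $1 }')
echo "$swapped"    # first line printed is "Quantity Animals"

# For the second line only (NR == 2), print the number of fields
# and the value of the last field.
info=$(printf '%s\n' "$data" | awk 'NR == 2 { print NF, $NF }')
echo "$info"       # prints: 2 32
```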
1.4 What else can be done
Cool stuff that you can do now with this new knowledge.