
AWK and GAWK

    /Dependencies:/ Before learning awk, it helps to be familiar with:
  • Unix Pipes
  • Shell Basics
  • Data Streams
  • Unix-like Operating Systems

Before we start learning awk, please create the following file named animals.txt:

Animals Quantity
Dogs    32
Cats    17
Birds   25
Cows    7
Ducks   9
Pigs   12


Simple Description of AWK:

AWK is a programming language found on UNIX systems. It is designed for text processing and is commonly used for data extraction and report generation. Its syntax will look familiar to anyone who has used C, Python or Bash, among other languages.

(Note: This document covers GNU awk, or gawk, which is the GNU Project's implementation of AWK. End Note)

Typical uses of awk:

  • Text processing.
  • Data extraction.
  • Formatted text reports.
  • Arithmetic operations.
  • String operations.
  • And many more!

Basic example

The first program we will write is very simple. It just prints "hello world!":


$ echo 'hello world!' | awk '{print}'
hello world!

*What just happened?* /awk works with pattern-action rules. Here it received the input "hello world!" and a rule that has the action 'print' but no pattern. A rule without a pattern matches every line, and 'print' with no arguments prints the whole current line. That is why we see "hello world!" as a result./
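
To see that the action runs once per input line, here is a minimal sketch with two lines of input. NR, awk's built-in counter for the current line number, is not introduced in the text above and is used here only for illustration:

```shell
# Feed two lines to awk. The rule has no pattern, so the action runs for both.
# NR holds the number of the line currently being processed.
printf 'first line\nsecond line\n' | awk '{ print NR ": " $0 }'
```

Each input line comes back prefixed with its line number, which shows the action being applied line by line.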

As we can see, awk is a language that processes its input line by line. In a nutshell, this is what awk does:

  1. Gets the input, a pattern to look for, and a rule to apply.
  2. Reads the input.
  3. Checks the first line of input against the pattern.
  4. If the pattern matches, applies the rule.
  5. Moves on to the next line and repeats until the end of the file.
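
The steps above can be sketched with rules that have an explicit pattern. This is a minimal example, assuming the animals.txt sample data (written here with single spaces, and recreated inside the snippet so it runs on its own):

```shell
# Recreate the sample file so this snippet is self-contained.
printf '%s\n' 'Animals Quantity' 'Dogs 32' 'Cats 17' 'Birds 25' \
  'Cows 7' 'Ducks 9' 'Pigs 12' > animals.txt

# Pattern: the regular expression /Cats/. Rule: print the matching line.
awk '/Cats/ { print }' animals.txt

# Pattern: a comparison on the second field. NR > 1 skips the header line,
# whose second field ("Quantity") is text rather than a number.
awk 'NR > 1 && $2 > 20 { print $1 }' animals.txt
```

The first command prints only the Cats line; the second prints the names of the animals whose quantity is above 20.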

Of course, awk can take multiple patterns and rules; all of them are tested against each line, in the order they were written. We can also change awk's behaviour so that it splits the input into records other than lines.
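
A short sketch of both ideas follows. It assumes the animals.txt sample data, and it assumes that "changing awk behaviour" refers to the record separator variable RS, which is one reading of the sentence above rather than something the text states:

```shell
# Recreate the sample file so this snippet is self-contained.
printf '%s\n' 'Animals Quantity' 'Dogs 32' 'Cats 17' 'Birds 25' \
  'Cows 7' 'Ducks 9' 'Pigs 12' > animals.txt

# Two rules in one program. Both are tested, in order, against every line.
awk '
  /Dogs/   { print "rule 1 matched:", $0 }
  $2 == 32 { print "rule 2 matched:", $0 }
' animals.txt

# Setting RS makes awk split its input on ";" instead of on newlines.
printf 'Dogs 32;Cats 17;Birds 25' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }'
```

The Dogs line satisfies both rules, so it is printed twice, once per rule. In the second command each ";"-separated chunk is treated as its own record.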

Better examples

With awk we can print the content of a file with:


$ awk '{print $0}' animals.txt
Animals Quantity
Dogs    32
Cats    17
Birds   25
Cows    7
Ducks   9
Pigs   12

We can also print only the first field of each line with $1:


$ awk '{print $1}' animals.txt
Animals
Dogs
Cats
Birds
Cows
Ducks
Pigs

We can also print only the second field of each line with $2:


$ awk '{print $2}' animals.txt
Quantity
32
17
25
7
9
12

*What just happened?* /"$0", "$1" and "$2" look like the positional parameters of a shell script, but in awk they do not refer to arguments. "$0" is the entire current line, while "$1" and "$2" are the first and second fields of that line. By default, fields are separated by whitespace./
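
Fields can also be reordered and counted. The following is a minimal sketch, assuming the animals.txt sample data; NR (current line number) and NF (number of fields on the current line) are built-in awk variables that the text above does not cover:

```shell
# Recreate the sample file so this snippet is self-contained.
printf '%s\n' 'Animals Quantity' 'Dogs 32' 'Cats 17' 'Birds 25' \
  'Cows 7' 'Ducks 9' 'Pigs 12' > animals.txt

# Skip the header (NR > 1) and print the two fields in swapped order.
awk 'NR > 1 { print $2, $1 }' animals.txt

# For the second line only: NF is the field count, $NF is the last field.
awk 'NR == 2 { print NF, $NF }' animals.txt
```

The comma in print separates the values with a single space, so the first command starts with "32 Dogs" and the second prints "2 32".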

What else can be done

Cool stuff that you can do now with this new knowledge.
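
As a starting point, here are a few small sketches that tie back to the typical uses listed earlier (arithmetic, formatted reports and string operations). They assume the animals.txt sample data; the report layout and the choice of functions are illustrative only:

```shell
# Recreate the sample file so this snippet is self-contained.
printf '%s\n' 'Animals Quantity' 'Dogs 32' 'Cats 17' 'Birds 25' \
  'Cows 7' 'Ducks 9' 'Pigs 12' > animals.txt

# Arithmetic: add up the second column, skipping the header line.
# The END rule runs once, after the last line has been read.
awk 'NR > 1 { total += $2 } END { print "Total:", total }' animals.txt

# Formatted report: left-align the name in 8 columns, right-align the number in 3.
awk 'NR > 1 { printf "%-8s %3d\n", $1, $2 }' animals.txt

# String operations: upper-case each name and print its length.
awk 'NR > 1 { print toupper($1), length($1) }' animals.txt
```

With the sample data the first command prints "Total: 102".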

Resources