An Awk Primer/Using Awk from the Command Line
The Awk programming language was designed to be simple but powerful. It allows a user to perform relatively sophisticated text-manipulation operations through Awk programs written on the command line.
For example, suppose I want to turn a document with single-spacing into a document with double-spacing. I could easily do that with the following Awk program:
awk '{print ; print ""}' infile > outfile
Notice how the program is enclosed in single-quotes (' '), which allows double-quotes (" ") to be used within the Awk expression. The single-quotes also "hide" special characters from the shell. We could also do this as follows:
awk "{print ; print \"\"}" infile > outfile
but the single-quote method is simpler.
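The difference between the two quoting styles matters most when the Awk program refers to fields such as $2, since "$" is also special to the shell. The following is a small sketch, using a made-up one-line input, of the same program written both ways:

```shell
# Single quotes: the shell passes $2 through to Awk untouched.
printf 'alpha beta\n' | awk '{ print $2 }'

# Double quotes: the $ must be escaped, or the shell would substitute
# its own positional parameter $2 before Awk ever sees the program.
printf 'alpha beta\n' | awk "{ print \$2 }"
```

Both commands print the second field of the sample line.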
The double-spacing program does what it is supposed to, but it also doubles every blank line in the input file, which leaves a lot of empty space in the output. That's easy to fix: just tell Awk to print the extra blank line only if the current line is not blank:
awk '{print ; if (NF != 0) print ""}' infile > outfile
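To see what the NF test does, the program can be tried on a few lines of sample input. This is only a sketch: the function name "double_space" and the sample text are made up for illustration.

```shell
# double_space is a hypothetical wrapper around the one-liner above.
# It reads the named files, or standard input if none are given.
double_space() {
    awk '{ print; if (NF != 0) print "" }' "$@"
}

# Three text lines with one blank line between the second and third.
printf 'first\nsecond\n\nthird\n' | double_space
```

Each of the three text lines is followed by one added blank line, while the blank line already in the input is passed through once rather than doubled.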
- One of the problems with Awk is that it is ingenious enough to make a user want to tinker with it, and use it for tasks for which it isn't really appropriate. For example, we could use Awk to count the number of lines in a file:
awk 'END {print NR}' infile
but this is dumb, because the "wc" (word count) utility, run as "wc -l", gives the same answer with less bother. Use the right tool for the job.
Awk is the right tool for slightly more complicated tasks. Once I had a file containing an email distribution list. The email addresses of different groups were placed on consecutive lines in the file, with the groups separated by blank lines. If I wanted to quickly and reliably determine how many people were on the distribution list, I couldn't use "wc", since it counts blank lines, but Awk handled it easily:
awk 'NF != 0 {++count} END {print count}' list
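The same count can be checked on a small made-up list. In this sketch the function name "count_nonblank" and the addresses are hypothetical, and "count + 0" is a small addition so that a list with no entries prints 0 instead of an empty line.

```shell
# count_nonblank is a hypothetical wrapper around the one-liner above.
count_nonblank() {
    # NF is 0 for lines that are empty or contain only whitespace.
    awk 'NF != 0 { ++count } END { print count + 0 }' "$@"
}

# Two groups separated by a blank line: three addresses in total.
printf 'ann@example.com\nbob@example.com\n\ncarol@example.com\n' | count_nonblank
```

With this input "wc -l" would report four lines, while the Awk program reports three.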
- Another problem I ran into was determining the average size of a number of files. I was creating a set of bitmaps with a scanner and storing them on a disk. The disk started getting full and I was curious to know just how many more bitmaps I could store on the disk.
I could obtain the file sizes in bytes using "wc -c" or the "list" utility ("ls -l" or "ll"). A few tests showed that "ll" was faster. Since "ll" lists the file size in the fifth field, all I had to do was sum up the fifth field and divide by the number of files. There was one slight problem, however: the first line of the output of "ll" reports the total number of blocks used, and had to be skipped, which also means the number of files is NR-1 rather than NR.
No problem. I simply entered:
ll | awk 'NR!=1 {s+=$5} END {print "Average: " s/(NR-1)}'
This gave me the average as about 40 KB per file.
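A slightly more defensive variant is sketched below. It assumes the usual "ls -l" layout with the size in the fifth field, keeps its own count of entries instead of relying on NR, and avoids dividing by zero in an empty directory. The function name "average_size" is made up for illustration.

```shell
# average_size is a hypothetical helper; it expects "ls -l" style
# output on standard input.
average_size() {
    awk 'NF >= 5 { s += $5; n++ }   # the "total" line has too few fields
         END {
             if (n > 0)
                 print "Average: " s / n
             else
                 print "no files found"
         }'
}

ls -l | average_size
```

Note that directories and other non-regular entries listed by "ls -l" are included in the average, just as in the one-liner above.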
- Awk is useful for performing simple iterative computations for which a more sophisticated language like C might prove overkill. Consider the Fibonacci sequence:
1 1 2 3 5 8 13 21 34 ...
Each element in the sequence is constructed by adding the two previous elements together, with the first two elements both defined as 1. It's a discrete formula for exponential growth. It is very easy to use Awk to generate this sequence:
awk 'BEGIN {a=1;b=1; while(++x<=10){print a; t=a;a=b;b=t+b}; exit}'
This prints the first ten elements of the sequence, one per line (shown here on a single line):
1 1 2 3 5 8 13 21 34 55
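The number of elements does not have to be fixed in the program. As a sketch, it can be passed in from the shell with Awk's "-v" option; the function name "fib" and the variable name "n" are made up for illustration.

```shell
# fib N prints the first N Fibonacci numbers, one per line.
fib() {
    awk -v n="$1" 'BEGIN {
        a = 1; b = 1                  # the first two elements are both 1
        for (i = 1; i <= n; i++) {
            print a
            t = a; a = b; b = t + b   # advance the pair (a, b) one step
        }
    }'
}

fib 10
```

Because the program consists only of a BEGIN block, Awk runs it and exits without waiting for any input.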