Awk program structure

Pattern1 { Actions; }
Pattern2 { Actions; }
Pattern3 { Actions; }

The input file is fed into the program, which tests each line against every pattern and runs the actions of the patterns that match. For example, with the following input file:

GET /posts/something.html
POST /posts/new
DELETE /posts/1

and the following awk command:

awk '/POST/ { print $2 }' inputfile.txt

It outputs only the second column/field of the line that matches POST:

/posts/new

This is very useful if you only want part of each matching line; grep, by contrast, outputs the whole matching line.
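As a minimal sketch of several pattern/action pairs in one program, the snippet below feeds the sample input above through awk from a shell variable rather than a file. The variable names input and result, and the labels created: and removed:, are made up for illustration.

```shell
# Sample input from the text above, held in a shell variable instead of a file.
input='GET /posts/something.html
POST /posts/new
DELETE /posts/1'

# Two pattern/action pairs; every input line is tested against both.
result=$(printf '%s\n' "$input" | awk '
/POST/   { print "created: " $2 }
/DELETE/ { print "removed: " $2 }
')

printf '%s\n' "$result"
```

The GET line matches neither pattern, so nothing is printed for it. Matching is case-sensitive, which is why the lowercase posts in the paths does not trigger /POST/.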

Fields

Fields are basically columns. By default they are separated by whitespace (spaces or tabs), but you can change the separator with the -F option: -F, makes awk treat each comma as a separator.

John Smith,London,123
Tom Delon,Bristol,321
Katie Levi,Bath,543234
Liz Grant,Bath,12323

Above is a typical CSV file. Awk has some predefined variables to reference its fields. For the first line, with -F, (a comma as separator):

$0 = whole line
$1 = John Smith
$2 = London
$3 = 123
etc

So to print out all the names:

awk -F, '{ print $1 }' input_file_with_above_content.csv

Notice that the pattern part can be omitted, in which case the action runs on every line. A pattern can be added to print only the names on lines containing Bath, which in this file means the people whose second field/column is Bath:

awk -F, '/Bath/ {print $1}' input_file_with_above_content.csv

This is very useful when you pipe the output into commands such as uniq or wc.
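A rough sketch of that kind of pipeline, assuming the CSV content above is held in a shell variable (csv, bath_count and cities are illustrative names, not anything awk defines):

```shell
# The CSV sample from above, held in a shell variable.
csv='John Smith,London,123
Tom Delon,Bristol,321
Katie Levi,Bath,543234
Liz Grant,Bath,12323'

# Count the lines that mention Bath; tr strips the padding some wc versions add.
bath_count=$(printf '%s\n' "$csv" | awk -F, '/Bath/ { print $1 }' | wc -l | tr -d ' ')

# List each city once; uniq only collapses adjacent duplicates, so sort first.
cities=$(printf '%s\n' "$csv" | awk -F, '{ print $2 }' | sort | uniq)

echo "$bath_count"
printf '%s\n' "$cities"
```

Note that -F, splits on every comma, so this only suits simple CSV files without quoted fields that contain commas.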

Patterns

Awk supports regular expressions. The exact syntax depends on which version of awk you have, but they should all support the most basic ones:

/abc/ - match lines containing abc
/^abc/ - match lines beginning with abc
/abc$/ - match lines ending with abc
/(GET|POST|PUT|DELETE)/ - match lines containing any of these HTTP verbs
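To see the anchors and alternation in action, here is a small sketch. The fourth input line (starting with NOTE) is invented purely to show that ^ ties the match to the start of the line; log, anchored and ends_html are illustrative names.

```shell
# Sample request log plus one made-up line that mentions GET mid-line.
log='GET /posts/something.html
POST /posts/new
DELETE /posts/1
NOTE: GET requests are cached'

# ^ anchors the alternation to the start of the line, so the NOTE line is skipped.
anchored=$(printf '%s\n' "$log" | awk '/^(GET|POST)/ { print $1 }')

# $ anchors the match to the end of the line.
ends_html=$(printf '%s\n' "$log" | awk '/html$/ { print $2 }')

printf '%s\n' "$anchored" "$ends_html"
```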

It also supports boolean expressions and comparison operators:

&& - and
|| - or
! - not

== - equal; loosely typed, a bit like PHP's == (e.g. "23" == 23 is true)
!= - not equal
>, <, >=, <= - greater than, less than, etc.

~ - regex match against a string, field or variable, e.g. $1 ~ /regex/
!~ - true when the string, field or variable does not match the regex

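All of these can be combined:

/POST/ || debug == 1

This matches lines containing POST, or every line when the debug variable is 1. Awk has no true keyword, so use 1 and 0; the variable can be set from the command line with -v debug=1.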
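The following sketch combines these operators against individual fields of the CSV sample from the Fields section. The thresholds and the names csv, big_bath and not_b are arbitrary choices for illustration.

```shell
# The CSV sample from the Fields section.
csv='John Smith,London,123
Tom Delon,Bristol,321
Katie Levi,Bath,543234
Liz Grant,Bath,12323'

# Second field is exactly Bath AND third field is numerically above 100000.
big_bath=$(printf '%s\n' "$csv" | awk -F, '$2 == "Bath" && $3 > 100000 { print $1 }')

# Negated regex match on one field: cities that do not start with B.
not_b=$(printf '%s\n' "$csv" | awk -F, '$2 !~ /^B/ { print $1 }')

printf '%s\n' "$big_bath" "$not_b"
```

Testing $2 directly is stricter than a bare /Bath/ pattern, which would also match a person named Bath in the first field.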

Actions

Action statements are enclosed in curly brackets: { Action; Action; Action }. There are many awk actions; here are some common ones:

{ print $0 } - print $0, aka the whole line; the $0 can be omitted
{ exit; } - exit the program
{ next; } - skips to the next line of input
{ a=$1; b=$0 } - variable assignment
{ a[$1] = $3 } - array assignment

Conditional:

{ if (expression/boolean) { Action }
	else if (expression/boolean) { Action }
	else { Action }
}

Loops:

{ for (item in a) { Action } } - loop over the keys of array a
{ for (i = 1; i < x; i++) { Action } } - C-style counting loop

As you can see from the actions, awk resembles a tiny programming language. Variables in awk are global by default (only function parameters are local), so be careful.
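Putting array assignment, a for-in loop and a conditional together, here is a sketch that totals the third field per city for the CSV sample. It relies on END, a special pattern not covered above that runs once after the last input line. The names sum, label and totals and the 1000 threshold are made up for illustration.

```shell
# The CSV sample from the Fields section.
csv='John Smith,London,123
Tom Delon,Bristol,321
Katie Levi,Bath,543234
Liz Grant,Bath,12323'

# for-in visits keys in no guaranteed order, so the output is sorted afterwards.
totals=$(printf '%s\n' "$csv" | awk -F, '
{ sum[$2] += $3 }    # array keyed by city, accumulating the third field
END {                # runs once, after the last line of input
    for (city in sum) {
        if (sum[city] > 1000) { label = "large" } else { label = "small" }
        print city, sum[city], label
    }
}' | sort)

printf '%s\n' "$totals"
```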

Functions

They look just like normal functions in any C-like language:

function name(value) {
	Actions;
}

and they can be called inside the action part:

{ name($1) }
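A small sketch of a complete function, assuming a POSIX-style awk (very old awk versions lack user-defined functions). initial_caps is a hypothetical helper; its extra parameter tmp is never passed by the caller, which is the usual awk idiom for getting a local variable.

```shell
# Capitalise the first letter of each of the first two fields.
name=$(printf 'john smith\n' | awk '
# Hypothetical helper: tmp is an unused parameter, so it acts as a local variable.
function initial_caps(word,    tmp) {
    tmp = toupper(substr(word, 1, 1)) substr(word, 2)
    return tmp
}
{ print initial_caps($1), initial_caps($2) }')

printf '%s\n' "$name"
```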

Special Variables

These are variables that awk itself defines. Some are meant to be changed by you and others are maintained by awk as it reads input; see the awk man page for more detail.
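For a concrete feel, here is a sketch using three variables that POSIX awk defines: NR (number of the current record), NF (number of fields in the current record) and OFS (the output field separator, set here with -v). The variable names csv and report and the choice of " | " as separator are illustrative.

```shell
# First two lines of the CSV sample from the Fields section.
csv='John Smith,London,123
Tom Delon,Bristol,321'

# NR and NF are maintained by awk; OFS is set by us and used between print arguments.
report=$(printf '%s\n' "$csv" | awk -F, -v OFS=' | ' '{ print NR, NF, $1 }')

printf '%s\n' "$report"
```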