Awk
Awk program structure
Pattern1 { Actions; }
Pattern2 { Actions; }
Pattern3 { Actions; }
The input file is fed into the program, which tests each line against every pattern in turn and runs the actions of the patterns that match. For example, with the following input file
GET /posts/something.html
POST /posts/new
DELETE /posts/1
and the following awk command:
awk '/POST/ { print $2 }' inputfile.txt
It will match the POST line, but output only its second column/field:
/posts/new
This is very useful when you only want part of each matching line; grep, by contrast, outputs the whole matching line.
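The difference from grep can be seen side by side in the sketch below. It reuses the three sample request lines from above; the variable names (requests, grep_out, awk_out) are only illustrative.

```shell
# Sample input from above, held in a shell variable for this sketch.
requests='GET /posts/something.html
POST /posts/new
DELETE /posts/1'

# grep prints the whole matching line.
grep_out=$(printf '%s\n' "$requests" | grep 'POST')

# awk prints only the second field of the matching line.
awk_out=$(printf '%s\n' "$requests" | awk '/POST/ { print $2 }')

echo "grep: $grep_out"   # grep: POST /posts/new
echo "awk:  $awk_out"    # awk:  /posts/new
```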
Fields
Fields are basically columns. By default the separator is whitespace (any run of spaces or tabs), but you can change it with the -F option: -F, makes awk treat each comma as a separator.
John Smith,London,123
Tom Delon,Bristol,321
Katie Levi,Bath,543234
Liz Grant,Bath,12323
Above is a typical CSV file. Awk has predefined variables for referencing the fields of the current line. Consider the first line with -F, (a comma as the separator):
$0 = whole line
$1 = John Smith
$2 = London
$3 = 123
etc
So to print all the names (the first field):
awk -F, '{print $1}' input_file_with_above_content.csv
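Note that the pattern can be omitted, in which case the action runs on every line. A pattern can be added to find the names of the people in Bath. Keep in mind that /Bath/ matches anywhere on the line, not just in the second field; a pattern of $2 == "Bath" would restrict the test to that field.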
awk -F, '/Bath/ {print $1}' input_file_with_above_content.csv
This is especially useful when you pipe the output into the uniq or wc commands.
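As a rough sketch of that kind of pipeline, the snippet below feeds the sample CSV through awk and then into sort, uniq and wc. The variable names are made up for illustration, and the tr call is only there because some wc implementations pad their output with spaces.

```shell
# Sample CSV from above, held in a shell variable for this sketch.
people='John Smith,London,123
Tom Delon,Bristol,321
Katie Levi,Bath,543234
Liz Grant,Bath,12323'

# Distinct cities: uniq only collapses adjacent duplicates, so sort first.
cities=$(printf '%s\n' "$people" | awk -F, '{ print $2 }' | sort | uniq)

# Number of people in Bath: count the lines awk prints.
bath_count=$(printf '%s\n' "$people" | awk -F, '$2 == "Bath" { print $1 }' | wc -l | tr -d ' ')

echo "$cities"        # Bath, Bristol, London (one per line)
echo "$bath_count"    # 2
```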
Patterns
Awk supports regular expressions. The exact syntax depends on which version of awk you have, but every version should support the most basic ones:
/abc/ - matches lines containing abc
/^abc/ - matches lines beginning with abc
/abc$/ - matches lines ending with abc
/(GET|POST|PUT|DELETE)/ - matches lines containing any of these HTTP verbs
It also supports boolean expressions and operators:
&& - and
|| - or
! - not
== - equal; loosely typed like PHP's ==, e.g. "23" == 23 is true
!= - not equal
>, <, >=, <= - comparison
~ - regex match against a string or variable, e.g. string ~ /regex/
!~ - regex non-match against a string or variable
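All of these can be combined:
/POST/ || debug == 1
This matches lines containing POST, or every line when the debug variable is 1. Awk has no true/false keywords (a bare true is just an unset variable), so compare against 1 and set the variable from the command line with -v debug=1.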
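The sketch below combines several of these operators on the request lines from the first example. It compares debug against 1 because awk has no boolean literals, and sets it with the standard -v option. The shell variable names are only illustrative.

```shell
requests='GET /posts/something.html
POST /posts/new
DELETE /posts/1'

# Field-level regex: the verb is GET or DELETE AND the path does not end in .html
non_html=$(printf '%s\n' "$requests" |
  awk '$1 ~ /^(GET|DELETE)$/ && $2 !~ /\.html$/ { print $2 }')

# debug is unset here, so only the /POST/ half of the pattern can match.
post_only=$(printf '%s\n' "$requests" |
  awk '/POST/ || debug == 1 { print $1 }')

# With -v debug=1 the second half is true for every line.
with_debug=$(printf '%s\n' "$requests" |
  awk -v debug=1 '/POST/ || debug == 1 { print $1 }')

echo "$non_html"     # /posts/1
echo "$post_only"    # POST
echo "$with_debug"   # GET, POST, DELETE (one per line)
```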
Actions
Action statements are enclosed in curly brackets: { Action; Action; Action }. Awk has many actions; here are some common ones:
{ print $0 } - print $0, i.e. the whole line; the $0 can be omitted
{ exit; } - exit the program
{ next; } - skips to the next line of input
{ a=$1; b=$0 } - variable assignment
{ a[$1] = $3 } - array assignment
Conditional:
{ if (expression/boolean) { Action }
  else if (expression/boolean) { Action }
  else { Action }
}
Loops:
{ for (key in a) { Action } } - loops over the indices (keys) of array a, in no guaranteed order
{ for (i = 1; i < x; i++) { Action } }
As you can see from the actions, awk resembles a tiny programming language. Be careful, though: variables in awk are global by default (only function parameters are local).
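To show several of these actions working together, here is a small sketch that counts people per city in the sample CSV. It relies on END, a special pattern not covered above that runs once after the last input line. The array name count and the "(shared)" label are invented for the example.

```shell
people='John Smith,London,123
Tom Delon,Bristol,321
Katie Levi,Bath,543234
Liz Grant,Bath,12323'

counts=$(printf '%s\n' "$people" | awk -F, '
  { count[$2]++ }                 # array assignment, keyed by city
  END {
    for (city in count) {         # loop over the array indices
      if (count[city] > 1) { print city, count[city], "(shared)" }
      else { print city, count[city] }
    }
  }' | sort)                      # for-in order is not guaranteed, so sort

echo "$counts"
# Bath 2 (shared)
# Bristol 1
# London 1
```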
Functions
They are just like normal functions in any C-like language:
function name(value) {
Actions;
}
and they can be called inside the action part:
{ name($1) }
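A minimal sketch of a complete function follows. The helper name initials is hypothetical; split and substr are standard awk built-ins. Listing extra parameters (parts, n) that the caller never passes is the usual awk idiom for getting local variables, since anything else would be global.

```shell
people='John Smith,London,123
Tom Delon,Bristol,321'

out=$(printf '%s\n' "$people" | awk -F, '
  # Hypothetical helper: first letter of the first and last word of a name.
  # parts and n are extra parameters used only as local variables.
  function initials(name,    parts, n) {
    n = split(name, parts, " ")
    return substr(parts[1], 1, 1) substr(parts[n], 1, 1)
  }
  { print initials($1), $2 }')

echo "$out"
# JS London
# TD Bristol
```

Note that a user-defined function must be called with no space between its name and the opening parenthesis.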
Special Variables
Awk also has a set of special (built-in) variables; some can be modified and some can't. See the man page for more detail.
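As an illustration, the sketch below uses four of the standard ones: NR (current line number), NF (number of fields on the line), $NF (the last field) and OFS (the output field separator, set here with -v). The separator string " | " is an arbitrary choice for the example.

```shell
people='John Smith,London,123
Tom Delon,Bristol,321'

# -F, sets the input separator (FS); -v OFS sets what print puts between items.
out=$(printf '%s\n' "$people" | awk -F, -v OFS=' | ' '{ print NR, NF, $NF }')

echo "$out"
# 1 | 3 | 123
# 2 | 3 | 321
```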