Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK is an interpreted programming language (the name stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more statements separated by semicolons (;) or newlines. The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of spaces and tabs. For each record, every condition is evaluated in order and, if it is true, the statements in its action block are executed. Within an action block, fields are accessed by a 1-based index, e.g. $2 for the second field, while $0 refers to the whole record.

If the condition is omitted, the action block is executed for every record. If the condition is present but the action block is omitted, the default action is print $0, which prints the current record, including any modifications already made to it. Because any non-zero number counts as true, awk '1' file tells awk to perform the default action (print) for every line.
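A minimal sketch of the pattern-action rules described above. The sample data and the shell variable names (data, eng_names, high_paid, field_counts, all_lines) are hypothetical and only serve the illustration:

```shell
# Hypothetical sample data: name, department, salary (whitespace-separated).
data='alice eng 120
bob ops 90
carol eng 110'

# Pattern and action: print field 1 for records whose second field is "eng".
eng_names=$(printf '%s\n' "$data" | awk '$2 == "eng" { print $1 }')

# Pattern only: the default action prints the whole record ($0).
high_paid=$(printf '%s\n' "$data" | awk '$3 > 100')

# Action only: runs for every record; NF holds the number of fields.
field_counts=$(printf '%s\n' "$data" | awk '{ print NF }')

# A constant true pattern with no action: every record is printed unchanged.
all_lines=$(printf '%s\n' "$data" | awk '1')

printf '%s\n' "$eng_names"
```

The four commands differ only in which half of the condition { action } pair is present, so they are a convenient way to check how a given awk treats a missing condition or a missing action.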

An awk program can also contain optional BEGIN and END blocks. The BEGIN action runs before any input is read, and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
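As a rough illustration of this layout, the sketch below sets the field separator and prints a header in BEGIN, accumulates a total per record, and reports it in END. The colon-separated sample data and the variable names (data, report, total) are assumptions made up for the example:

```shell
# Hypothetical input: one item and a quantity per line, colon-separated.
data='apples:3
pears:5
plums:4'

report=$(printf '%s\n' "$data" | awk '
BEGIN { FS = ":"; print "item qty" }   # runs once, before any input is read
      { total += $2; print $1, $2 }    # runs for every record
END   { print "total", total }         # runs once, after the last record
')

printf '%s\n' "$report"
```

Setting FS inside BEGIN ensures the separator is in effect before the first record is split into fields.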

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common version, found on most Unix-like systems. It is also specified by an IEEE (POSIX) standard.
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - during the development of AWK, the developers released a new version ("new awk") under a separate name to avoid confusion, but it is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. It is the only version whose developers attempted to add i18n support, and it allows users to write their own C shared libraries to extend it with their own "plug-ins". This version is the standard implementation on Linux.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • gawk (GNU's version of awk)
  • nawk (A very old, pre-POSIX version also from AT&T)
  • mawk (A different interpreter written by Mike Brennan)
  • sed (A kindred tool often mentioned in the same breath)
32722 questions
32 votes, 7 answers

Awk to skip the blank lines

The output of my script is tab delimited using awk as : awk -v variable=$bashvariable '{print variable"\t single\t" $0"\t double"}' myinfile.c The awk command is run in a while loop which updates the variable value and the file myinfile.c for…
Gil
31 votes, 7 answers

Awk consider double quoted string as one token and ignore space in between

Data file - data.txt: ABC "I am ABC" 35 DESC DEF "I am not ABC" 42 DESC cat data.txt | awk '{print $2}' will result the "I" instead of the string being quoted How to make awk so that it ignore the space within the quote and think that it is one…
Roy Chan
31 votes, 5 answers

How to insert a line using sed before a pattern and after a line number?

How to insert a line into a file using sed before a pattern and after a line number? And how to use the same in shell script? This inserts a line before every line with the pattern : sed '/Sysadmin/i \ Linux Scripting' filename.txt And this changes…
Nohsib
31 votes, 3 answers

String comparison in awk

I need to compare two strings in alphabetic order, not only equality test. I want to know is there way to do string comparison in awk?
Dagang
31 votes, 4 answers

ignorecase in AWK

The following command is working as expected. # some command | awk '/(\<^create\>|\<^alter\>|\<^drop\>)/,/;/' create table todel1 (id int) max_rows=2 /*!*/; alter table todel1 engine=InnoDB /*!*/; create database common /*!*/; create database…
shantanuo
31 votes, 21 answers

Check if all of multiple strings or regexes exist in a file

I want to check if all of my strings exist in a text file. They could exist on the same line or on different lines. And partial matches should be OK. Like this: ... string1 ... string2 ... string3 ... string1 string2 ... string1 string2…
codeforester
31 votes, 6 answers

matching a line with a literal asterisk "*" in grep

Tried $ echo "$STRING" | egrep "(\*)" and also $ echo "$STRING" | egrep '(\*)' and countless other variations. I just want to match a line that contains a literal asterisk anywhere in the line.
Derrick
31 votes, 8 answers

How to compare two decimal numbers in bash/awk?

I am trying to compare two decimal values but I am getting errors. I used if [ "$(echo $result1 '>' $result2 | bc -l)" -eq 1 ];then as suggested by the other Stack Overflow thread. I am getting errors. What is the correct way to go about this?
user244333
31 votes, 2 answers

Awk if else issues

Bash points an arrow to "else" and says "syntax error" in a provocative whining tone. awk '{if($3 != 0) a = ($3/$4) print $0, a; else if($3==0) print $0, "-" }' file > out Why?
AWE
30 votes, 4 answers

Looping over input fields as array

Is it possible to do something like this: $ cat foo.txt 1 2 3 4 foo bar baz hello world $ awk '{ for(i in $){ print $[i]; } }' foo.txt 1 2 3 4 foo bar baz hello world I know you could do this: $ awk '{ split($0,array," "); for(i in array){ print…
Tyilo
30 votes, 3 answers

Casting to int in awk

I'm looking for a method to cast a string to an int in awk. I have the following which appears to be doing a string comparison (note: field $5 is a percentage in one of two formats: 80% or 9.0%) awk '{if (substr($5,1,(length($5)-1)) >= 90) ... So,…
xelco52
30 votes, 2 answers

What does a number do after curly braces?

Why does echo foo bar..baz bork | awk 'BEGIN{RS=".."} {gsub(OFS,"\t");}1' seem to do the same thing as echo foo bar..baz bork | awk 'BEGIN{RS=".."} {gsub(OFS,"\t");} {print;}' ? In fact any number that isn't zero (including decimals and negatives)…
shadowtalker
30 votes, 4 answers

Compute average and standard deviation with awk

I have a 'file.dat' with 24 (rows) x 16 (columns) data. I have already tested the following awk script that computes de average of each column. touch aver-std.dat awk '{ for (i=1; i<=NF; i++) { sum[i]+= $i } } END { for (i=1; i<=NF; i++ ) {…
PLM
30 votes, 6 answers

Add double quotes around fields in AWK script output?

I have written an awk script that converts a distributor flatfile into a CSV importable into Magento. This file is semi-colon delimited. It is not putting quotes around each field like the importer requires. It works fairly well, but is causing some…
John Steensen
30 votes, 2 answers

Print all Fields with AWK separated by OFS

Is there a way to print all records separated by the OFS without typing out each column number. #Desired style of syntax, undesired result [kbrandt@glade: ~] echo "1 2 3 4" | gawk 'BEGIN { OFS=" :-( "}; {print $0}' 1 2 3 4 #Desired result,…
Kyle Brandt