Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK is an interpreted programming language (AWK stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of whitespace. For each record, every condition is checked in turn and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field; $0 refers to the whole record. If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record, including any modifications made to it so far. Since a non-zero number is treated as true, awk '1' file instructs awk to perform the default action (print) for every line.
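The rules above can be tried out with a short sketch like the following. The sample records (names, ages, cities) are made up for illustration, and the commands assume any POSIX-compatible awk is available as awk.

```shell
# Sample input: three records with whitespace-separated fields (made-up data).
input='alice 30 paris
bob 25 london
carol 41 rome'

# Condition plus action: print the first field when the second exceeds 28.
printf '%s\n' "$input" | awk '$2 > 28 { print $1 }'   # prints alice, then carol

# Action only: runs for every record; here it prints the third field.
printf '%s\n' "$input" | awk '{ print $3 }'

# Condition only: the default action (print $0) applies to matching records.
printf '%s\n' "$input" | awk '/^b/'                   # prints bob 25 london

# A constant true condition prints every record unchanged.
printf '%s\n' "$input" | awk '1'
```

Here printf feeds the records on standard input; passing a file name after the program, as in awk '1' file, works the same way.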

An awk program can also have optional BEGIN and END blocks, where the BEGIN action runs before any input is read and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
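As a minimal sketch of this layout, the program below prints a header in BEGIN, accumulates a total while reading records, and reports it in END. The item names and counts are invented for the example, and the variable name total is an arbitrary choice.

```shell
# Sum the second column of some made-up data.
printf '%s\n' 'apples 3' 'pears 5' 'plums 4' |
awk '
BEGIN { print "item count" }        # runs once, before any input is read
      { total += $2; print $1, $2 } # no condition: runs for every record
END   { print "total", total }      # runs once, after all input is read
'
```

An uninitialized awk variable such as total starts out as zero in numeric context, so no explicit initialization in BEGIN is needed.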

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

awk - the most common; found on most Unix-like systems. Its behavior is specified by the POSIX (IEEE Std 1003.1) standard.
mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
nawk - during the development of AWK, the developers released a new version (new awk) under a separate name to avoid confusion, but it is itself now very old and lacks functionality present in all POSIX awks.
gawk - also known as GNU awk. The only version whose developers have attempted to add i18n support. It lets users write their own C shared libraries to extend it with "plug-ins". It is the default awk on many Linux distributions.

When asking questions about data processing using awk, please include complete input and desired output.

Some frequently occurring themes:

Books:

The AWK Programming Language by Aho, Kernighan & Weinberger (archive.org link)
Effective awk Programming, 4th edition by Robbins (see The GNU Awk User's Guide below for latest online version)
Effective awk Programming, 3rd edition by Robbins
Sed & Awk, 2nd edition by Dougherty & Robbins
Sed & Awk Pocket Reference, 2nd Edition by Arnold Robbins
AWK Language Programming - free book
Awk One-Liners Explained
GNU AWK one-liners by Sundeep Agarwal (includes a chapter on regular expressions)

Resources:

Awk.Info (archive.org link)
The GNU Awk User's Guide
POSIX specification of awk
Idiomatic awk
The awk programming language tutorial site
Awk one-liners
Awk one-liners explained

Other StackExchange Resources:

Related tags:

gawk (GNU's version of awk)
nawk (A very old, pre-POSIX version also from AT&T)
mawk (A different interpreter written by Mike Brennan)
sed (A kindred tool often mentioned in the same breath)

32722 questions

5 answers

About awk and integer to ASCII character conversion

Just to make sure, is it really that using awk (Gnu awk at least) I can convert: from octal to ASCII by: print "\101" # or a="\101" A from hex to ASCII: print "\x41" # or b="\x41" B but from decimal to ASCII I have to: $ printf…

awk gawk

asked Dec 27 '16 at 21:51

James Brown

3 answers

How to take multiple argument in bash and pass them to awk?

I am writing a function in which I am replacing the leading/trailing space from the column and if there is no value in the column replace it with null. Function is working fine for one column but how can I modify it for multiple columns? Function…

bash function unix awk arguments

asked Nov 11 '16 at 11:47

VIPIN KUMAR

2 answers

How do I use awk to extract data within nested delimiters using non-greedy regexps

This question occurs repeatedly in many forms with many different multi-character delimiters and so IMHO is worth a canonical answer. Given an input file like: .. 1 .. a<2 .. .. .. @{<>}@ .. 4 .. ..…

awk

asked Nov 09 '16 at 17:14

Ed Morton

1 answer

AWK - replace specific column on matching line, then print other lines

I am trying to alter a column/field within a 'header' line of DNA sequences that is thousands of lines long. Specifically, I want to change the first field of the header (compX_seqy), which ALWAYS starts with ">": An example of just the first two…

awk sed

asked Nov 07 '16 at 16:42

LP_640

3 answers

AWK/BASH: how to match a field in one file from a field in another?

I have 2 files, the first contains the following: ... John Allen Smith II 16 555-555-5555 10/24/2010 John Allen Smith II 3 555-555-5555 10/24/2010 John Allen Smith II 17 555-555-5555 10/24/2010 John Doe 16 555-555-5555 10/24/2010 Jane Smith 16…

bash shell file awk

asked Oct 16 '10 at 05:31

Tomek

1 answer

asort(src,dest) to a multidimensional array

I'm trying to abuse asort() (just because) to copy an array src to array dest, no problem there: $ awk 'BEGIN { split("first;second;third",src,";") # make src array for testing asort(src, dest, "@ind_num_asc") # copy array to dest …

arrays sorting multidimensional-array awk gawk

asked Sep 02 '16 at 09:21

James Brown

4 answers

Change AWK field separator on the fly

I'd like to use AWK to take the following spread sheet where the first name and last name are in one column: Peter Griffin, 31 Spooner St, Quahog Homer Simpson, 732 Evergreen Terr, Springfield Fred Flintstone, 301 Cobblestone Way, Bedrock and…

linux bash awk

asked Aug 30 '16 at 15:37

Ryan R

3 answers

Why do 4 different languages give 4 different results here?

Consider this (all commands run on a 64-bit Arch Linux system): Perl (v5.24.0) $ perl -le 'print 10190150730169267102/1000%10' 6 awk (GNU Awk 4.1.3) $ awk 'BEGIN{print 10190150730169267102/1000%10}' 6 R (3.3.1) >…

python perl awk rounding long-integer

asked Aug 23 '16 at 11:54

terdon

1 answer

Compare consecutive rows and multiple columns in awk and random select one of duplicate lines

I read the question: Compare consecutive rows in awk/(or python) and random select one of duplicate lines. Now I have an additional question: How should I change the code, if I want to do this comparison not only for the x-value, but also for the…

bash awk sed

asked Jul 22 '16 at 23:09

Jojo

2 answers

Using sed to match on multiple patterns with one expression, and delete until a blank line

On a RHEL 6.6 system, using ifconfig and GNU sed, I want to display only the Ethernet interfaces which aren't logical sub interfaces, or the loopback. For example, the output should not contain interface records where the interface name is like…

regex bash awk sed

asked Jul 12 '16 at 20:48

Chris

5 answers

Converting lines in chunks into tab delimited

I have the following lines in 2 chunks (actually there are ~10K of them). And in this example each chunk contains 3 lines. The chunks are separated by an empty line. So the chunks are like "paragraphs". xox 91-233 chicago koko 121-111 alabama I…

csv awk sed newline

asked Jun 29 '16 at 01:23

neversaint

2 answers

Awk/sed replace newlines

Intro: I have been given a CSV file in which the field delimiter is the pipe character (i.e., |). This file has a pre-defined number of fields (say N). I can discover the value of N by reading the header of the CSV file, which we can assume to be…

shell csv awk replace

asked Jun 27 '16 at 16:33

user2340612

2 answers

Reading value from an ini style file with sed/awk

I wrote a simple bash function which would read value from an ini file (defined by variable CONF_FILE) and output it getConfValue() { #getConfValue section variable #return value of a specific variable from given section of a conf file …

bash awk sed ini

asked Jun 16 '16 at 19:45

smihael

3 answers

Remove duplicate lines and overwrite file in same command

I'm trying to remove duplicate lines from a file and update the file. For some reason I have to write it to a new file and replace it. Is this the only way? awk '!seen[$0]++' .gitignore > .gitignore awk '!seen[$0]++' .gitignore > .gitignore_new &&…

bash awk

asked Jun 11 '16 at 20:47

ThomasReggi

3 answers

awk set variable to line after if statement match

I have the following text... BIOS Information Manufacturer : Dell Inc. Version : 2.5.2 Release Date : 01/28/2015 Firmware Information Name : iDRAC7 Version :…

awk

asked Jun 03 '16 at 14:07

user1601716
