Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK (named after its authors Aho, Weinberger, and Kernighan) is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of blanks (spaces and tabs). For each record, every condition is evaluated in order and, if it is true, the commands in its action block are executed. Within an action block, fields are accessed by a 1-based index, e.g. $2 for the second field, while $0 refers to the whole record. If the condition is omitted, the action block is executed for every record. If the condition is present but the action block is omitted, the default action is print $0, which prints the current record, including any modifications made to it by earlier rules. Because a non-zero number is treated as true, awk '1' file tells awk to perform the default action (print) for every line.
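
The rules above can be tried on a small input. The following is only a minimal sketch: the sample data and the shell variable names (data, older, cities, matched, all) are made up for illustration and do not come from the tag wiki.

```shell
# Hypothetical sample input: three records, whitespace-separated fields.
data='alice 30 paris
bob 25 london
carol 41 oslo'

# Condition and action: print field 1 of records whose field 2 exceeds 28.
older=$(printf '%s\n' "$data" | awk '$2 > 28 { print $1 }')

# Action only: with no condition, the action runs for every record.
cities=$(printf '%s\n' "$data" | awk '{ print $3 }')

# Condition only: the default action prints each matching record.
matched=$(printf '%s\n' "$data" | awk '/^bob/')

# Constant true condition: every record is printed unchanged.
all=$(printf '%s\n' "$data" | awk '1')

printf '%s\n' "$older" "$cities" "$matched" "$all"
```

Each awk program is wrapped in single quotes so that the shell does not expand $1, $2 and so on before awk sees them.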

An awk program can also contain optional BEGIN and END blocks: the BEGIN action runs before any input is read, and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
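
As a rough illustration of that layout, the sketch below sums one column. The colon-separated input and the names prices, summary, total and count are assumptions invented for this example.

```shell
# Hypothetical input: item name and price separated by a colon.
prices='apple:3
pear:5
plum:4'

# BEGIN runs before any input is read (here it sets the field separator),
# the unconditioned rule runs once per record,
# and END runs after the last record has been processed.
summary=$(printf '%s\n' "$prices" | awk '
BEGIN { FS = ":" }
      { total += $2; count++ }
END   { print count " items, total " total }
')

printf '%s\n' "$summary"
```

Setting FS in BEGIN has the same effect here as passing -F: on the command line; it is done in BEGIN only to show that the block runs before the first record is split.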

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

awk - the most common name; an implementation under this name will be found on most Unix-like systems. It is also specified by a well-defined IEEE (POSIX) standard.
mawk - a fast AWK implementation built around a byte-code interpreter.
nawk - during the development of AWK, the developers released a new version ("new awk") under this name to avoid confusion with the original, but it is itself now very old and lacks functionality present in all POSIX awks.
gawk - also known as GNU awk. It is the only version whose developers have attempted to add i18n support, and it allows users to extend it with "plug-ins" written as C shared libraries. It is the default awk on many Linux distributions.

When asking questions about data processing using awk, please include complete input and desired output.

Some frequently occurring themes:

Books:

The AWK Programming Language by Aho, Kernighan & Weinberger (archive.org link)
Effective AWK Programming, 4th edition by Robbins (see The GNU Awk User's Guide below for the latest online version)
Effective AWK Programming, 3rd edition by Robbins
Sed & Awk, 2nd edition by Dougherty & Robbins
Sed & Awk Pocket Reference, 2nd Edition by Arnold Robbins
AWK Language Programming - free book
Awk One-Liners Explained
GNU AWK one-liners by Sundeep Agarwal (includes a chapter on regular expressions)

Resources:

Awk.Info (archive.org link)
The GNU Awk User's Guide
POSIX specification of awk
Idiomatic awk
The awk programming language tutorial site
Awk one-liners
Awk one-liners explained

Other StackExchange Resources:

Related tags:

gawk (GNU's version of awk)
nawk (A very old, pre-POSIX version also from AT&T)
mawk (A different interpreter written by Mike Brennan)
sed (A kindred tool often mentioned in the same breath)

32722 questions

114

votes

8 answers

Tab separated values in awk

How do I select the first column from the TAB separated string? # echo "LOAD_SETTLED LOAD_INIT 2011-01-13 03:50:01" | awk -F'\t' '{print $1}' The above will return the entire line and not just "LOAD_SETTLED" as expected. Update: I need to…

awk

asked Mar 21 '11 at 05:43

shantanuo


112

votes

4 answers

Printing column separated by comma using Awk command line

I have a problem here. I have to print a column in a text file using awk. However, the columns are not separated by spaces at all, only using a single comma. Looks something like this: column1,column2,column3,column4,column5,column6 How would I…

csv awk

asked Nov 10 '14 at 11:14

user3364728


108

votes

11 answers

How to print all the columns after a particular number using awk?

On shell, I pipe to awk when I need a particular column. This prints column 9, for example: ... | awk '{print $9}' How can I tell awk to print all the columns including and after column 9, not just column 9?

shell awk

asked Feb 22 '11 at 17:57

Lazer


107

votes

4 answers

Select row and element in awk

I learned that in awk, $2 is the 2nd column. How to specify the ith line and the element at the ith row and jth column?

awk

asked Oct 01 '09 at 21:12

Tim

107

votes

5 answers

Using awk to remove the Byte-order mark

How would an awk script (presumably a one-liner) for removing a BOM look like? Specification: print every line after the first (NR > 1) for the first line: If it starts with #FE #FF or #FF #FE, remove those and print the rest

unicode awk byte-order-mark

asked Jul 01 '09 at 11:37

Boldewyn


106

votes

5 answers

How to print the number of characters in each line of a text file

I would like to print the number of characters in each line of a text file using a unix command. I know it is simple with powershell gc abc.txt | % {$_.length} but I need unix command.

shell unix sed awk

asked Jan 09 '12 at 10:00

vikas368


105

votes

10 answers

cut or awk command to print first field of first row

I am trying print the first field of the first row of an output. Here is the case. I just need to print only SUSE from this output. # cat /etc/*release SUSE Linux Enterprise Server 11 (x86_64) VERSION = 11 PATCHLEVEL = 2 Tried with cat…

linux bash unix awk

asked Mar 05 '14 at 07:05

user3331975


105

votes

12 answers

Split one file into multiple files based on delimiter

I have one file with -| as delimiter after each section...need to create separate files for each section using unix. example of input…

linux unix awk split

asked Jul 03 '12 at 15:07

user1499178


104

votes

25 answers

How to decode URL-encoded string in shell?

I have a file with a list of user-agents which are encoded. E.g.: Mozilla%2F5.0%20%28Macintosh%3B%20U%3B%20Intel%20Mac%20OS%20X%2010.6%3B%20en I want a shell script which can read this file and write to a new file with decoded strings. Mozilla/5.0…

bash shell awk sed urldecode

asked Jun 06 '11 at 10:28

user785717


103

votes

8 answers

Turning multiple lines into one comma separated line

I have the following data in multiple lines: foo bar qux zuu sdf sdfasdf What I want to do is to convert them to one comma separated line: foo,bar,qux,zuu,sdf,sdfasdf What's the best unix one-liner to do that?

linux perl unix sed awk

asked Apr 02 '13 at 07:48

neversaint


102

votes

13 answers

how to use sed, awk, or gawk to print only what is matched?

I see lots of examples and man pages on how to do things like search-and-replace using sed, awk, or gawk. But in my case, I have a regular expression that I want to run against a text file to extract a specific value. I don't want to do…

regex unix sed awk gawk

asked Nov 14 '09 at 08:34

Stéphane


votes

11 answers

find difference between two text files with one item per line

I have two files: file 1 dsf sdfsd dsfsdf file 2 ljljlj lkklk dsf sdfsd dsfsdf I want to display what is in file 2 but not in file 1, so file 3 should look like ljljlj lkklk

bash file scripting sed awk

asked Nov 02 '10 at 15:01

vehomzzz


votes

6 answers

How to add to the end of lines containing a pattern with sed or awk?

Here is example file: somestuff... all: thing otherthing some other stuff What I want to do is to add to the line that starts with all: like this: somestuff... all: thing otherthing anotherthing some other stuff

bash sed awk

asked Mar 06 '12 at 20:54

yasar


votes

17 answers

grep for multiple strings in file on different lines (ie. whole file, not line based search)?

I want to grep for files containing the words Dansk, Svenska or Norsk on any line, with a usable returncode (as I really only like to have the info that the strings are contained, my one-liner goes a little further then this). I have many files…

bash awk grep

asked Jan 25 '11 at 15:28

Christian

votes

12 answers

Insert multiple lines into a file after specified pattern using shell script

I want to insert multiple lines into a file using shell script. Let us consider my input file contents are: input.txt: abcd accd cdef line web Now I have to insert four lines after the line 'cdef' in the input.txt file. After inserting my file…

linux bash shell sed awk

asked Mar 19 '14 at 05:43

user27

