Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. The name comes from the surnames of its authors: Aho, Weinberger, and Kernighan. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;) or newlines. The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of spaces and tabs. For each record, every condition is checked in turn and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field, while $0 is the whole record.

If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record including any modifications made to it so far. Since any non-zero number is true, awk '1' file performs the default action (print) for every line.
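
The rules above can be sketched with a few one-liners. This is only an illustration: the sample data (name, department, salary) and the variable names are made up for this example.

```shell
# Hypothetical sample data: name, department, salary (whitespace-separated).
data='alice eng 120
bob ops 95
carol eng 130'

# Condition and action: print the first field of records whose third field exceeds 100.
high=$(printf '%s\n' "$data" | awk '$3 > 100 { print $1 }')

# Condition only: the default action (print $0) prints the whole matching record.
ops=$(printf '%s\n' "$data" | awk '$2 == "ops"')

# Action only: runs for every record; NF holds the number of fields in the record.
counts=$(printf '%s\n' "$data" | awk '{ print NF }')

# A constant true condition with no action: every record is printed unchanged.
copy=$(printf '%s\n' "$data" | awk '1')

printf '%s\n' "$high" "$ops" "$counts"
```

Here high holds alice and carol, ops holds the single bob record, counts holds 3 for each of the three records, and copy is identical to the input.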

An awk program can also contain optional BEGIN and END rules: the BEGIN action runs before any input is read, and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
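
As a minimal sketch of that layout, the following program prints a header in BEGIN, filters and accumulates per record, and prints a total in END. The input lines and the names total and out are assumptions chosen for illustration.

```shell
out=$(printf '%s\n' 'apple 3' 'pear 5' 'plum 2' | awk '
BEGIN  { print "item qty" }             # runs once, before any input is read
$2 > 2 { total += $2; print $1, $2 }    # runs for each record whose second field exceeds 2
END    { print "total", total }         # runs once, after the last record
')
printf '%s\n' "$out"
```

The plum record fails the condition, so it is neither printed nor added to total; the END rule then reports the sum of the remaining quantities (8).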

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

awk - the most common; it is found on most Unix-like systems. It also has a well-defined IEEE (POSIX) standard.
mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
nawk - during the development of AWK, the developers released a new version (new awk) to avoid confusion, but it is itself now very old and lacks functionality present in all POSIX awks.
gawk - also known as GNU awk. The only version in which the developers attempted to add i18n support. It allows users to write their own C shared libraries to extend it with their own "plug-ins". This version is the default awk on many Linux distributions.

When asking questions about data processing using awk, please include complete input and desired output.

Books:

The AWK Programming Language by Aho, Kernighan & Weinberger (archive.org link)
Effective awk Programming, 4th edition by Robbins (see The GNU Awk User's Guide below for the latest online version)
Effective awk Programming, 3rd edition by Robbins
Sed & Awk, 2nd edition by Dougherty & Robbins
Sed & Awk Pocket Reference, 2nd Edition by Arnold Robbins
AWK Language Programming - free book
Awk One-Liners Explained
GNU AWK one-liners by Sundeep Agarwal (includes a chapter on regular expressions)

Resources:

Awk.Info (archive.org link)
The GNU Awk User's Guide
POSIX specification of awk
Idiomatic awk
The awk programming language tutorial site
Awk one-liners
Awk one-liners explained

Related tags:

gawk (GNU's version of awk)
nawk (A very old, pre-POSIX version also from AT&T)
mawk (A different interpreter written by Mike Brennan)
sed (A kindred tool often mentioned in the same breath)

32722 questions

1 answer

using awk in tcl script

I want to print the fields of a particular column in a file from within a Tcl script. I tried exec awk '{print $4}' foo, where foo is the filename, but it does not work; it gives the error can't read "4": no such variable. How can I do the above awk in tcl…

regex awk tcl

asked Jun 07 '13 at 17:50

ravi

2 answers

Convert .gitignore to rsync merge filter include file? (with sed or awk)

I tried using rsync --filter=':+ .gitignore' (-/exclude works but not include) to no avail. Basically i just want to include the .ignore file in a script and upload everything in it with rsync to the remote. If anyone would have the skills to sed…

git sed awk rsync

asked May 27 '13 at 01:32

sabgenton

1 answer

Awk merge the results of processing two files into a single file

I use awk to extract and calculate information from two different files and I want to merge the results into a single file in columns ( for example, the output of first file in columns 1 and 2 and the output of the second one in 3 and 4 ). The…

awk

asked May 23 '13 at 18:14

user2245731

3 answers

Convert exponentials and rounding numbers in BASH

I have a file like this: 1.5000000000E-01 7.5714285714E+00 4.0000000000E-01 2.5000000000E-01 7.5714285714E+00 4.0000000000E-01 and I have to convert it to something like 0.15 7.57 0.40. I mean I want the numbers to have only 2 decimals and not to be…

bash awk

asked May 22 '13 at 08:19

ayasha

3 answers

Print package dependency tree

Using this file, I would like to print a tree of package dependencies, given a single base package. For example, take the Bash package @ bash # few lines removed requires: coreutils libintl8 libncursesw10 libreadline7 _update-info-dir cygwin I…

bash recursion awk circular-dependency

asked May 16 '13 at 05:10

Zombo

6 answers

How to add single quotes around columns using awk

Just wondering how can I add single quotes around fields, so I can import it to mysql without warnings or errors. I have a csv file with lots of content. 16:47:11,3,r-4-VM,250000000.,0.50822578824,131072,0,0,0,0,0 Desired output…

bash sed awk

asked May 10 '13 at 20:55

Deano

4 answers

How to delete an entire row if a specific column contains a zero?

I need to delete the rows which contain a 0 value in column number 5. Before: 1QWO 10 5 45 100 7.5 1R0R 6 3 15 100 8.5 1R5M 4 0 6 0 6.5 1R8S 4 0 6 0 6 1R9L 2 …

perl sed awk

asked May 01 '13 at 08:05

user2176228

2 answers

sort a file based on a column in another file

I have two files both in the format of: loc1 num1 num2 loc2 num3 num4 The first column is the location and I want to use the order of the locations in the first file to sort the second file so that I can put the two files together where the…

linux shell sed awk

asked Apr 29 '13 at 17:20

olala

3 answers

awk, print lines which start with four digits

I want to print all lines from a file which begin with four digits. I tried this already but it does not work: cat data.txt | awk --posix '{ if ($1 ~ /^[0-9]{4}/) print $1}' No output is generated. The next line prints all lines which start with a…

awk

asked Apr 29 '13 at 14:45

Lukas Banach

3 answers

Picking up CSV fields by name using awk

Suppose I have a CSV file with headers of the following form: Field1,Field2 3,262000 4,449000 5,650000 6,853000 7,1061000 8,1263000 9,1473000 10,1683000 11,1893000 I would like to write an awk script which will take a comma-separated list of field…

csv awk gawk

asked Apr 18 '13 at 20:10

merlin2011

2 answers

Linux - numerical sort then overwrite file

I have a csv file with a general format date, 2013.04.04, 2013.04.04, 2012.04.02, 2013.02.01, 2013.04.05, 2013.04.02, a script I run will add data to this file which will not necessarily be in date order. How can I sort the file into date order…

linux sorting awk

asked Apr 18 '13 at 12:11

moadeep

2 answers

Command to replace specific column of csv file for first 100 rows

The following command replaces the second column with the value e in the whole csv file, but what if I want to replace it only in the first 100 rows? awk -F, '{$2="e";}1' OFS=, file The rest of the rows of the csv file should be left intact.

awk

asked Apr 18 '13 at 10:44

user752590

3 answers

Using awk and df (disk free) to show only mount name and space used

What would be the correct CL sequence to execute a df -h and only print out the mount name and used space (percentage)? I'm trying to do a scripted report for our servers. I tried df -h | awk '{print $1 $4}' which spits out $df -h | awk '{print $1…

bash awk

asked Apr 16 '13 at 19:30

Rick

5 answers

how to insert a newline \n after x numbers of words, with AWK or Sed

I have this in one line: We were born in the earth beyond the land I want it in 3 words lines, to be like this: We were born in the earth beyond the land

linux bash shell unix awk

asked Apr 12 '13 at 19:53

user2275669

5 answers

Move line(s) to follow another line in a file

I got a file that has a line in the file like this: check=('78905905f5a4ed82160c327f3fd34cba') I'd like to be able to move this line to follow a line that looks like this: files=('somefile.txt') The array though at times that can span multiple…

linux awk

asked Oct 20 '09 at 22:20

Todd Partridge 'Gen2ly'
