Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK is an interpreted programming language (the name stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of spaces and tabs. For each record, every condition is checked in turn and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field, while $0 refers to the whole record. If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record, including any modifications made to it so far. Since a non-zero number is true, awk '1' file instructs awk to perform the default action (print) for every line.
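The rules above can be sketched with a few one-liners. This is only an illustration: the three-column sample data (name, department, salary) and the shell variable names are made up for the example and are not part of any real data set.

```sh
# Hypothetical sample input: name, department, salary (whitespace-separated).
sample='alice eng 120
bob ops 95
carol eng 130'

# Condition plus action: for records whose 2nd field is "eng", print fields 1 and 3.
eng=$(printf '%s\n' "$sample" | awk '$2 == "eng" { print $1, $3 }')

# Condition only: the default action (print $0) prints each matching record.
high=$(printf '%s\n' "$sample" | awk '$3 > 100')

# Action only: runs for every record; NF holds the number of fields.
counts=$(printf '%s\n' "$sample" | awk '{ print NF }')

# A constant true condition with no action: every record is printed unchanged.
all=$(printf '%s\n' "$sample" | awk '1')

printf '%s\n' "$eng" "$high" "$counts" "$all"
```

With this input, the first command selects alice and carol, the second prints their full records, the third prints 3 for each line, and the last reproduces the input as-is.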

An awk program can have optional BEGIN and END blocks, where the BEGIN action runs before any input is read and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
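As a minimal sketch of this layout, the following program prints a header in BEGIN, accumulates a sum per matching record, and prints the total in END. The sample data, the column meanings, and the variable name total are assumptions chosen for illustration.

```sh
# Hypothetical sample input: name, department, salary (whitespace-separated).
sample='alice eng 120
bob ops 95
carol eng 130'

report=$(printf '%s\n' "$sample" | awk '
BEGIN       { print "name salary" }          # runs once, before any input is read
$2 == "eng" { total += $3; print $1, $3 }    # runs for each matching record
END         { print "total", total }         # runs once, after all input is read
')

printf '%s\n' "$report"
```

Uninitialized awk variables such as total start out as zero (or the empty string), so no explicit initialization in BEGIN is needed for a running sum.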

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

awk - the most common; it is found on most Unix-like systems, and its behavior is specified by an IEEE (POSIX) standard.
mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
nawk - during the development of AWK, the developers released a new version under the name "new awk" to avoid confusion with the original; it is itself now very old and lacks functionality present in all POSIX awks.
gawk - also known as GNU awk. The only version in which the developers attempted to add i18n support. It also allows users to extend it with their own "plug-ins" written as C shared libraries. It is the default awk on many Linux distributions.

When asking questions about data processing using awk, please include complete input and desired output.

Some frequently occurring themes:

Books:

The AWK Programming Language by Aho, Kernighan & Weinberger (archive.org link)
Effective AWK, 4th edition by Robbins (see The GNU AWK Users Guide below for latest online version)
Effective AWK, 3rd edition by Robbins
Sed & Awk, 2nd edition by Dougherty & Robbins
Sed & Awk Pocket Reference, 2nd Edition by Arnold Robbins
AWK Language Programming - free book
Awk One-Liners Explained
GNU AWK one-liners by Sundeep Agarwal (includes a chapter on regular expressions)

Resources:

Awk.Info (archive.org link)
The GNU Awk User's Guide
POSIX specification of awk
Idiomatic awk
The awk programming language tutorial site
Awk one-liners
Awk one-liners explained

Other StackExchange Resources:

Related tags:

gawk (GNU's version of awk)
nawk (A very old, pre-POSIX version also from AT&T)
mawk (A different interpreter written by Mike Brennan)
sed (A kindred tool often mentioned in the same breath)

32722 questions

4 answers

Extract using sed or grep

I am fairly new to grep and sed commands. How can +50.0 be extracted from Core 0: +50.0°C (high = +80.0°C, crit = +90.0°C) using grep or sed in bash script? acpitz-virtual-0 Adapter: Virtual device temp1: +50.0°C (crit =…

regex linux sed awk grep

asked Sep 05 '13 at 17:47

curious_coder

3 answers

Using the bash sort command within variable-length filenames

I am trying to numerically sort a series of files output by the ls command which match the pattern either ABCDE1234A1789.RST.txt or ABCDE12345A1789.RST.txt by the '789' field. In the example patterns above, ABCDE is the same for all files, 1234 or…

perl bash sorting command-line awk

asked Sep 04 '13 at 20:40

Michael Meech

2 answers

Join multiple tables by row names

I would like to merge multiple tables by row names. The tables differ in the amount of rows and they have unique and shared rows, which should all appear in output. If possible I would like to solve the problem with awk, but I am also fine with…

bash shell join awk

asked Aug 25 '13 at 09:56

user2715173

5 answers

Awk between two patterns with pattern in the middle

Hi i am looking for an awk that can find two patterns and print the data between them to a file only if in the middle there is a third patterns in the middle. for example: Start 1 2 middle 3 End Start 1 2 End And the output will…

linux shell unix awk sh

asked Aug 19 '13 at 17:33

Ggdw

1 answer

How to specify one tab as field separator in AWK?

The default for white-space field separators, such as tab when using FS = "\t", in AWK is either one or many. Therefore, if you want to read in a tab separated file with null values in some columns (other than the last), it skips over them. For…

tabs awk field separator

asked Aug 08 '13 at 01:02

user2662766

3 answers

Combining columns within a single file using awk

I am trying to reformat a large file. The first 6 columns of each line are OK but the rest of the columns in the line need to be combined in increments of 2 with a "/" character in between. Example file (showing only a few columns but have many…

awk

asked Aug 06 '13 at 23:57

KBoehme

3 answers

finding duplicates in a field and printing them in unix bash

I have a file the contains apple apple banana orange apple orange I want a script that finds the duplicates apple and orange and tells the user that the following : apple and orange are repeated. I tried nawk '!x[$1]++' FS="," filename to find…

bash unix awk

asked Jul 29 '13 at 06:41

t28292

2 answers

Replace fileds with AWK by using a different file as translation list

I am using awk in Windows. I have a script called test.awk. This script should read a file and replace a certain filed (key) with a value. The key->value list is in a file called translate.txt. It's structure is like this: e;Emil …

replace awk user-defined-functions

asked Jul 19 '13 at 09:59

Schamas

2 answers

Awk - print next record following matched record

I'm trying to get a next field after matching field using awk. Is there an option to do that or do I need to scan the record into array then check each field in array and print the one after that? What I have so far: The file format…

awk

asked Nov 19 '09 at 06:47

stefanB

4 answers

Filling in gaps with awk or anything

I have a list such as below, where the 1 column is position and the other columns aren't important for this question. 1 1 2 3 4 5 2 1 2 3 4 5 5 1 2 3 4 5 8 1 2 3 4 5 9 1 2 3 4 5 10 1 2 3 4 5 11 1 2 3 4 5 I want to fill in the gaps such that…

bash awk

asked Jul 10 '13 at 22:09

jeffpkamp

3 answers

Extracting only my function names from ELF binary

I'm writing a script for extracting all the functions (written by user) in a binary. The following shell script extracts my function names as well as some library functions which start with __. readelf -s ./a.out | gawk ' { if ($4 == "FUNC" && $3…

c linux shell awk readelf

asked Jul 05 '13 at 10:47

Jeyaram

5 answers

What is platform independent way of converting csv files to tsv files if the csv files can be quoted with comma inside the quoted strings?

Suppose I have a csv file like this a,b,c 1,"drivingme,mad",2 and I want convert it to a TSV abc 1drivingme,mad2 Whilst I can write some Python code to do this. I found this to be slow. Is there a better awk, sed or perl way…

perl sed awk csv

asked Jun 18 '13 at 06:31

xiaodai

7 answers

How to find files containing exactly 16 lines?

I have to find files containing exactly 16 lines in Bash. My idea is: find -type f | grep '/^...$/' Does anyone know how to utilise find + grep or maybe find + awk? Then, move the matching files to another directory. Deleting all non-matching…

bash awk grep

asked Jun 17 '13 at 12:32

user2493340

2 answers

How to supress default print in awk?

This is with gawk 4.0.0, running on Windows 7 with cygwin. The program is invoked like gawk -f procjournal.gawk testdata I have some data that looks like this: "Date";"Type";"Amount";"Balance" "6/11/2013 11:51:17 AM";"Transaction…

awk gawk

asked Jun 11 '13 at 20:36

wades

7 answers

Get common values in 2 arrays in shell scripting

I have an array1 = (20,30,40,50) array2 = (10,20,30,80,100,110,40) I have to get the common values from these 2 arrays in my array 3 like: array3 = (20,30,40) in ascending sorted order.

bash shell unix sorting awk

asked Jun 10 '13 at 21:23

iaav
