Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK is an interpreted programming language (AWK stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands, separated by semicolons (;) or newlines. The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of spaces and tabs. For each record, every condition is checked in turn and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field, while $0 refers to the whole record.

If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record, including any modifications made to it by earlier actions. Since a non-zero number is true, awk '1' file instructs awk to perform the default action (print) for every line.
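
The rules above can be sketched with a few one-liners. This is only an illustration: the sample input (a name and an age per line) and the shell variable names are hypothetical, and the threshold 26 is arbitrary.

```shell
# Hypothetical sample input: a name and an age on each line.
input='alice 30
bob 25
carol 41'

# Condition and action: print the first field when the second exceeds 26.
names=$(printf '%s\n' "$input" | awk '$2 > 26 { print $1 }')

# Condition only: the default action prints the whole record.
matched=$(printf '%s\n' "$input" | awk '$2 > 26')

# Action only: the action runs for every record.
ages=$(printf '%s\n' "$input" | awk '{ print $2 }')

# A constant true condition with no action copies the input unchanged.
copy=$(printf '%s\n' "$input" | awk '1')

printf '%s\n' "$names" "$matched" "$ages" "$copy"
```

Here names holds alice and carol, matched holds their complete lines, ages holds the second column, and copy is identical to the input.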

An awk program can also have optional BEGIN and END blocks: the BEGIN action runs before any input is read, and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
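
As a rough sketch of how the three kinds of block fit together, the following counts and sums the prices above a threshold. The colon-separated sample data, the threshold 3 and the variable names are made up for illustration.

```shell
# Hypothetical sample input: an item name and a price, colon-separated.
input='apple:3
pear:5
plum:4'

report=$(printf '%s\n' "$input" | awk '
BEGIN  { FS = ":"; print "count total" }  # runs once, before any input is read
$2 > 3 { n++; sum += $2 }                 # runs for each record whose price exceeds 3
END    { print n, sum }                   # runs once, after the last record
')

printf '%s\n' "$report"
```

The BEGIN block is a convenient place to set the field separator FS and print a header, and the END block reports values accumulated while the records were read.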

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common version, found on most Unix-like systems. It is also specified by an IEEE standard (POSIX).
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - during the development of AWK, the developers released a new version (new awk) under a separate name to avoid confusion, but it is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. The only version whose developers have attempted to add i18n support. It also allows users to write their own C shared libraries to extend it with their own "plug-ins". This version is the default awk on many Linux distributions.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • gawk (GNU's version of awk)
  • nawk (A very old, pre-POSIX version also from AT&T)
  • mawk (A different interpreter written by Mike Brennan)
  • sed (A kindred tool often mentioned in the same breath)

32722 questions

64 votes, 9 answers

Replace whitespace with a comma in a text file in Linux

I need to edit a few text files (an output from sar) and convert them into CSV files. I need to change every whitespace (maybe it's a tab between the numbers in the output) using sed or awk functions (an easy shell script in Linux). Can anyone help…
aye

63 votes, 11 answers

Uppercasing First Letter of Words Using SED

How do you convert the first letter of each word to a capital letter, e.g. "Trouble me Gold rush brides" into "Trouble Me Gold Rush Brides"?
neversaint

62 votes, 5 answers

Print the last line of a file, from the CLI

How to print just the last line of a file?
yael

59 votes, 5 answers

How can I pass variables from awk to a shell command?

I am trying to run a shell command from within awk for each line of a file, and the shell command needs one input argument. I tried to use system(), but it didn't recognize the input argument. Each line of this file is an address of a file, and I…
Vahid Mirjalili

59 votes, 4 answers

How to check if the variable value in AWK script is null or empty?

I am using AWK script to process some logs. At one place I need to check if the variable value is null or empty to make some decision. Any Idea how to achieve the same? awk ' { { split($i, keyVal, "@") key=keyVal[1]; …
samarth

58 votes, 3 answers

How to match a pattern given in a variable in awk?

I want to extract a substring where certain pattern exist from pipe separated file, thus I used below command, awk -F ":" '/REWARD REQ. SERVER HEADERS/{print $1, $2, $3, $4}' sample_profile.txt Here, 'REWARD REQ. SERVER HEADERS' is a pattern which…
Chintamani Manjare

56 votes, 8 answers

How to remove blank lines from a Unix file

I need to remove all the blank lines from an input file and write into an output file. Here is my data as…
Teja

55 votes, 6 answers

How do I add a line of text to the middle of a file using bash?

I'm trying to add a line of text to the middle of a text file in a bash script. Specifically, I'm trying to add a nameserver to my /etc/resolv.conf file. As it stands, resolv.conf looks like this: # Generated by NetworkManager domain…
PHLAK

55 votes, 5 answers

Multiple strings, Truncate line at 80 characters

I'm new to awk and sed, and I'm looking for a way to truncate a line at 80 characters, but I'm printing several strings in that line using printf. The last two strings are the ones that give me problems because they vary in size on each iteration of…
TZPike05

55 votes, 4 answers

Scripts for computing the average of a list of numbers in a data file

The file data.txt contains the following: 1.00 1.23 54.4 213.2 3.4 The output of the scripts are supposed to be: ave: 54.646 Some simple scripts are preferred.
JackWM

55 votes, 9 answers

In AWK, is it possible to specify "ranges" of fields?

In AWK, is it possible to specify "ranges" of fields? Example. Given a tab-separated file "foo" with 100 fields per line, I want to print only the fields 32 to 57 for each line, and save the result in a file "bar". What I do now: awk…
user438602

54 votes, 4 answers

Using curl in a bash script and getting curl: (3) Illegal characters found in URL

So I have a very simple bash script that is curl'ing to an auth server for a header. The header url is written to a var and then used in the next curl call. When using the var set in the first curl call I am getting "curl: (3) Illegal characters…
gregwinn

54 votes, 6 answers

Can I use awk to convert all the lower-case letters into upper-case?

I have a file mixed with lower-case letters and upper-case letters, can I use awk to convert all the letters in that file into upper-case?
Yishu Fang

54 votes, 5 answers

How can I set the grep after context to be "until the next blank line"?

With grep I know how to set the context to a fixed number of lines. Is it possible to show a context based on an arbitrary string condition, like set after-context to "until the next blank line"? Or possibly some other combination of…
pixelearth

53 votes, 5 answers

Splitting a file in linux based on content

I have an email dump of around 400mb. I want to split this into .txt files, consisting of one mail in each file. Every e-mail starts with the standard HTML header specifying the doctype. This means I will have to split my files based on the above…
Greenhorn