Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK is an interpreted programming language (AWK stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;) or newlines. The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of spaces or tabs. For each record, every condition is checked in order and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field, while $0 holds the whole record. If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record, including any modifications made to it by earlier actions. Because a non-zero number is treated as true, awk '1' file performs the default action (print) for every line.
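The rules described above can be tried out with a few one-liners. This is a minimal sketch, assuming a small made-up input; the sample function and its three records are hypothetical and exist only for illustration.

```shell
# Hypothetical sample input: name, department, amount (whitespace-separated).
sample() {
  printf '%s\n' 'alice eng 120' 'bob ops 95' 'carol eng 130'
}

# Condition and action: print field 1 of records whose field 2 is "eng".
sample | awk '$2 == "eng" { print $1 }'

# Condition only: the default action (print $0) prints each matching record.
sample | awk '$3 > 100'

# Action only: the block runs for every record.
sample | awk '{ print $2 }'

# A constant true condition with the default action copies the input unchanged.
sample | awk '1'
```

The first command prints only the names alice and carol, while the last one reproduces all three input lines as they are.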

An awk program can also contain optional BEGIN and END blocks: the BEGIN action runs before any input is read, and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
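As a rough illustration of that layout, the sketch below prints a header in BEGIN, filters and accumulates per record, and reports totals in END. The scores and summarize functions, the sample values, and the threshold of 20 are all hypothetical.

```shell
# Hypothetical input: one label and one number per line.
scores() {
  printf '%s\n' 'a 10' 'b 20' 'c 30'
}

# Hypothetical helper wrapping one awk program that uses all three kinds of block.
summarize() {
  awk '
    BEGIN    { print "label value" }               # runs before any input is read
    $2 >= 20 { n++; sum += $2; print }             # runs for each matching record
    END      { print "matched", n, "sum", sum }    # runs after all input is read
  '
}

scores | summarize
```

With this input the header comes first, then the two records whose second field is at least 20, and finally the line "matched 2 sum 50".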

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common name; an implementation under this name is found on most Unix-like systems. Its behavior is specified by an IEEE standard (POSIX).
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - during the development of AWK, the developers released a new version (new awk) under this name to avoid confusion with the original, but it is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. The only version in which the developers attempted to add i18n support. It also allows users to write their own C shared libraries to extend it with their own "plug-ins". This version is the default awk on many Linux distributions.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • (GNU's version of awk)
  • (A very old, pre-POSIX version also from AT&T)
  • (A different interpreter written by Mike Brennan)
  • (A kindred tool often mentioned in the same breath)
32722 questions
237
votes
3 answers

How to grep for case insensitive string in a file?

I have a file file1 which ends with Success... OR success... I want to grep for the word success in a way which is not case sensitive way. I have written the following command but it is case sensitive cat file1 | grep "success\.\.\." How can i…
all_techie
235
votes
8 answers

How to get the second column from command output?

My command's output is something like: 1540 "A B" 6 "C" 119 "D" The first column is always a number, followed by a space, then a double-quoted string. My purpose is to get the second column only, like: "A B" "C" "D" I intended to use…
Qiang Xu
221
votes
7 answers

awk without printing newline

I want the variable sum/NR to be printed side-by-side in each iteration. How do we avoid awk from printing newline in each iteration ? In my code a newline is printed by default in each iteration for file in cg_c ep_c is_c tau xhpl printf "\n $file"…
Sharat Chandra
215
votes
12 answers

Insert a line at specific line number with sed or awk

I have a script file which I need to modify with another script to insert a text at the 8th line. String to insert: Project_Name=sowstest, into a file called start. I tried to use awk and sed, but my command is getting garbled.
ashok
213
votes
10 answers

Print second-to-last column/field in `awk`

I want to print the second-to-last column or field in awk. The number of fields is the NF variable. I know that I should be able to use $NF, but I'm not sure how it can be used. And this does not seem to work: awk ' { print ( $NF-- ) } '
Brian G
209
votes
21 answers

How to merge every two lines into one from the command line?

I have a text file with the following format. The first line is the "KEY" and the second line is the "VALUE". KEY 4048:1736 string 3 KEY 0:1772 string 1 KEY 4192:1349 string 1 KEY 7329:2407 string 2 KEY 0:1774 string 1 I need the value in the same…
shantanuo
207
votes
8 answers

What is the shortest way to get n-th column of an output?

Let's say that during your workday you repeatedly encounter the following form of columnized output from some command in bash (in my case from executing svn st in my Rails working directory): ? changes.patch M app/models/superman.rb A …
Sv1
200
votes
9 answers

How to delete duplicate lines in a file without sorting it in Unix

Is there a way to delete duplicate lines in a file in Unix? I can do it with sort -u and uniq commands, but I want to use sed or awk. Is that possible?
Vijay
196
votes
13 answers

Sort a text file by line length including spaces

I have a CSV file that looks like this AS2345,ASDF1232, Mr. Plain Example, 110 Binary ave.,Atlantis,RI,12345,(999)123-5555,1.56 AS2345,ASDF1232, Mrs. Plain Example, 1121110 Ternary st. 110 Binary…
gnarbarian
192
votes
11 answers

Printing the last column of a line in a file

I have a file that is constantly being written to/updated. I want to find the last line containing a particular word, then print the last column of that line. The file looks something like this. More A1/B1/C1 lines will be appended to it over…
Rayne
184
votes
19 answers

Is there a Unix utility to prepend timestamps to stdin?

I ended up writing a quick little script for this in Python, but I was wondering if there was a utility you could feed text into which would prepend each line with some text -- in my specific case, a timestamp. Ideally, the use would be something…
Joe Shaw
181
votes
23 answers

How can I delete a newline if it is the last character in a file?

I have some files that I'd like to delete the last newline if it is the last character in a file. od -c shows me that the command I run does write the file with a trailing new line: 0013600 n t > \n I've tried a few tricks with sed but the…
Todd Partridge 'Gen2ly'
172
votes
7 answers

Save modifications in place with awk

I am learning awk and I would like to know if there is an option to write changes to file, similar to sed where I would use -i option to save modifications to a file. I do understand that I could use redirection to write changes. However is there…
Deano
159
votes
10 answers

How to select lines between two marker patterns which may occur multiple times with awk/sed

Using awk or sed how can I select lines which are occurring between two different marker patterns? There may be multiple sections marked with these patterns. For example: Suppose the file contains:…
dvai
155
votes
9 answers

How to print matched regex pattern using awk?

Using awk, I need to find a word in a file that matches a regex pattern. I only want to print the word matched with the pattern. So if in the line, I have: xxx yyy zzz And pattern: /yyy/ I want to only get: yyy EDIT: thanks to kurumi i managed…
marverix