Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK is an interpreted programming language (AWK stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, and each record is split into fields (by default, records are separated by newlines and fields by runs of spaces and tabs). For each record, every condition is checked in turn and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field. If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record after any transformations. Since a non-zero number is treated as true, awk '1' file instructs awk to perform the default action (print) for every line.
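As a rough illustration of the rules above, the following sketch runs a few one-liners against a small made-up input. The variable name `sample` and its contents are assumptions chosen for the example, not part of any real data set.

```shell
# Hypothetical sample input: a name and a quantity on each line
sample=$(printf 'apple 5\npeach 5\nmelon 1\n')

# Condition and action: print the first field of records whose second field is 5
printf '%s\n' "$sample" | awk '$2 == 5 { print $1 }'

# Condition only: the default action (print $0) runs for matching records
printf '%s\n' "$sample" | awk '$2 < 5'

# Action only: the block runs for every record; NR is the current record number
printf '%s\n' "$sample" | awk '{ print NR ": " $0 }'

# A constant true condition with no action prints every record unchanged
printf '%s\n' "$sample" | awk '1'
```

The first command should print apple and peach, and the second should print only the melon line, since that is the only record whose second field is less than 5.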

An awk program can also have optional BEGIN and END blocks, where the BEGIN action is run before any input is read and the END action is run after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
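A minimal sketch of this layout, assuming a made-up three-line input: the program below sums the first column. The shell function name `sum_first_column` is hypothetical and only there to keep the example reusable.

```shell
# Hypothetical helper: wraps an awk program with BEGIN, main, and END blocks
sum_first_column() {
  awk '
  BEGIN { print "summing column 1" }   # runs once, before any input is read
        { total += $1 }                # no condition: runs for every record
  END   { print "total:", total }      # runs once, after all input is read
  '
}

# Feed it three records whose first fields are 1, 2 and 3
printf '1 4 7\n2 5 8\n3 6 9\n' | sum_first_column
```

With this input the header line is printed first, followed by a total of 6. The variable total needs no initialisation because awk treats an unset variable as 0 in a numeric context.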

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common; it is found on most Unix-like systems and has a well-defined IEEE standard.
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - during the development of AWK, the developers released a new version under this name (new awk) to avoid confusion, but it is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. The only version in which the developers attempted to add i18n support. It allows users to write their own C shared libraries to extend it with their own "plug-ins". This version is the standard implementation on Linux.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • (GNU's version of awk)
  • (A very old, pre-POSIX version also from AT&T)
  • (A different interpreter written by Mike Brennan)
  • (A kindred tool often mentioned in the same breath)
32722 questions
5 votes, 2 answers

Use getline inside loop

Consider this script

#!awk -f
BEGIN {
  "date" | getline foo
  print foo
}

It will print the current date, as expected. However if you put it in a loop

#!awk -f
BEGIN {
  while (1) {
    "date" | getline foo
    printf "%s\r", foo
  }
}

it just…
Zombo
5 votes, 6 answers

Sum of all rows of all columns - Bash

I have a file like this

1 4 7 ...
2 5 8
3 6 9

And I would like to have as output

6 15 24 ...

That is the sum of all the lines for all the columns. I know that to sum all the lines of a certain column (say column 1) you can do like this: awk…
ayasha
5 votes, 3 answers

Linux Terminal: Finding number of lines longer than x

I come to you with a problem that has me stumped. I'm attempting to find the number of lines in a file (in this case, the html of a certain site) longer than x (which, in this case, is 80). For example: google.com (by checking with wc -l) has 7…
Doestovsky
5 votes, 2 answers

How to filter on column value in bash

I'm facing a problem while trying to grep (filter) a logfile on the value of an integer. logfile.log:

2014-11-16 21:22:15 8 10.133.23.9 PROXIED ...
2014-11-16 21:22:15 1 163.104.40.133 authentication_failed DENIED ...
2014-11-16 21:22:15 15…
B3luT
5 votes, 1 answer

Collapse sequential numbers to ranges in bash

I am trying to collapse sequential numbers to ranges in bash. For example, if my input file is 1 2 3 4 15 16 17 18 22 23 45 46 47 I want the output as: 1 4 15 18 22 23 45 47 How can I do this with awk or sed in a single line command? Thanks for…
arnstrm
5 votes, 4 answers

Bash- scramble characters contained in a string

So I have this function with the following output: AGsg4SKKs74s62# I need to find a way to scramble the characters without deleting anything, i.e. all characters must be present after I scramble them. I can only use bash utilities including awk and…
Bruce Strafford
5 votes, 5 answers

grep: Keeping lines that has specific string in certain column

I am trying to pick out the lines that have a certain value in a certain column and save them to an output. I am trying to do this with grep. Is it possible? My data looks like this: apple 5 abcdefd ewdsf peach 5 ewtdsfe wtesdf melon 1 …
user3557715
5 votes, 1 answer

associative arrays in awk challenging memory limits

This is related to my recent post in Awk code with associative arrays -- array doesn't seem populated, but no error and also to optimizing loop, passing parameters from external file, naming array arguments within awk My basic problem here is simply…
Murgie
5 votes, 4 answers

How to add line with spaces at beginning, and with backslash at end with sed?

I know the sed syntax to add a line after another line in a file, which is sed -i '/LINE1/a LINE2' FILE This adds LINE2 after LINE1 in FILE correct? How do I add a line with a backslash at the end? For example, from This is a a line \ Indented…
Chris F
5 votes, 2 answers

how can we remove last 7 lines of file in unix

How to remove last 7 lines from the csv file using unix commands. For example - abc bffkms slds Row started 1 Row started 2 Row started 3 Row started 4 Row started 5 Row started 6 Row started 7 I want to delete the last 7 lines from above file.…
Pooja25
5 votes, 3 answers

How to print a range of columns in a CSV in AWK?

With awk, I can print any column within a CSV, e.g., this will print the 10th column in file.csv. awk -F, '{ print $10 }' file.csv If I need to print columns 5-10, including the comma, I only know this way: awk -F, '{ print…
Village
5 votes, 2 answers

BASH Script using awk to extract a key

I'm creating dkim private & public keys

openssl genrsa -out dkim1024.key 1024
openssl rsa -in dkim1024.key -out dkim1024.pub -pubout -outform PEM

I have a bash script using awk to extract a key file KEY=/usr/bin/awk…
elzwhere
5 votes, 4 answers

Isolate the last octet from an IP address and put it in a variable

The following script: IP=`ifconfig en0 inet | grep inet | sed 's/.*inet *//; s/ .*//'` isolates the IP address from the output of the ifconfig command and puts it into the variable $IP. How can I now isolate the last octet from the said IP address and put it in a…
blackwire
5 votes, 2 answers

Is there a way to define a user-defined function inside an awk statement which is inside a bash script?

I think the question speaks for itself. I am a beginner so please let me know if this is possible or not. If not, can you give me a better solution because my bash script depends heavily on certain awk statements but the majority of the script is…
Redson
5 votes, 3 answers

How define array on awk

I am looking for a simple example of array definition in awk. How do I define an array and use its elements in the awk language?
alex