Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK (named after its authors Aho, Weinberger, and Kernighan) is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by horizontal whitespace. For each record, every condition is checked in turn and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field. If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record after any transformations. Since a non-zero number is treated as true, awk '1' file instructs awk to perform the default action (print) for every line.
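A minimal sketch of the pattern-action model described above. The file name scores.txt and its contents are hypothetical, made up only for illustration:

```shell
# Hypothetical sample data: a name and a score, separated by whitespace.
printf 'alice 90\nbob 45\ncarol 72\n' > scores.txt

# Condition and action: print the first field of records whose second field exceeds 50.
awk '$2 > 50 { print $1 }' scores.txt

# Action without a condition: the action runs for every record.
awk '{ print $2, $1 }' scores.txt

# Condition without an action: the default action (print $0) is applied.
awk '$1 == "bob"' scores.txt

# A constant non-zero condition is always true, so every record is printed.
awk '1' scores.txt
```

Each command reads the same file independently, so the four forms can be compared side by side.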

An awk program can also have optional BEGIN and END blocks: the BEGIN action is invoked before any input is read, and the END action is invoked after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
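As a rough illustration of the BEGIN and END blocks, the sketch below totals a column of numbers. The file name numbers.txt, its contents, and the variable names sum and n are assumptions chosen for the example:

```shell
# Hypothetical input: one number per line.
printf '10\n20\n30\n' > numbers.txt

awk '
BEGIN { print "summing..." }    # runs once, before any input is read
      { sum += $1; n++ }        # runs for every record
END   { if (n > 0) print "total:", sum, "mean:", sum / n }   # runs once, after all input
' numbers.txt
```

The n > 0 guard in the END block avoids a division by zero when the input is empty.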

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common version, found on most Unix-like systems. It also has a well-defined IEEE standard.
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - during the development of AWK, the developers released a new version (new awk) to avoid confusion, but it is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. It is the only version in which the developers attempted to add i18n support, and it allows users to write their own C shared libraries to extend it with their own "plug-ins". This version is the standard implementation for Linux.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • (GNU's version of awk)
  • (A very old, pre-POSIX version also from AT&T)
  • (A different interpreter written by Mike Brennan)
  • (A kindred tool often mentioned in the same breath)
32722 questions

49 votes, 3 answers

awk OR statement

Does awk have an OR statement i.e given the following snippet: awk '{if ($2=="abc") print "blah"}' Is it possible to add an OR statement so that if $2==abc OR def -> print?
Numpty

49 votes, 3 answers

How can I make awk not use scientific notation when printing small values?

In the following awk command awk '{sum+=$1; ++n} END {avg=sum/n; print "Avg monitoring time = "avg}' file.txt what should I change to remove scientific notation output (very small values displayed as 1.5e-05) ? I was not able to succeed with the…
Manuel Selva

48 votes, 3 answers

Pass parameter to an awk script file

If I want to pass a parameter to an awk script file, how can I do that ? #!/usr/bin/awk -f {print $1} Here I want to print the first argument passed to the script from the shell, like: bash-prompt> echo "test" | ./myawkscript.awk…
Mahmoud Emam

47 votes, 3 answers

Remove a specific character using awk or sed

I have a command output from which I want to remove the double quotes ". Command: strings -a libAddressDoctor5.so |\ grep EngineVersion |\ awk '{if(NR==2)print}' |\ awk '{print$2}' Output: EngineVersion="5.2.5.624" I'd like to know how to remove…
AabinGunz

47 votes, 19 answers

Removing trailing / starting newlines with sed, awk, tr, and friends

I would like to remove all of the empty lines from a file, but only when they are at the end/start of a file (that is, if there are no non-empty lines before them, at the start; and if there are no non-empty lines after them, at the end.) Is this…
ELLIOTTCABLE

47 votes, 7 answers

generating frequency table from file

Given an input file containing one single number per line, how could I get a count of how many times an item occurred in that file? cat input.txt 1 2 1 3 1 0 desired output (=>[1,3,1,1]): cat output.txt 0 1 1 3 2 1 3 1 It would be great, if the…
Javier

47 votes, 11 answers

Extraction of data from a simple XML file

I've a XML file with the contents: programming I need a way to extract what is in the tags, programmin in this case. This should be done on linux…
Zacky112

47 votes, 6 answers

Awk/Unix group by

have this text file: name, age joe,42 jim,20 bob,15 mike,24 mike,15 mike,54 bob,21 Trying to get this (count): joe 1 jim 1 bob 2 mike 3 Thanks,
C B

46 votes, 4 answers

Assign AWK result to variable

This should be pretty straightforward and I don't know why I am struggling with it. I am running the following psql command from within a shell script in order to find out whether all indexes have been dropped before inserting data. INDEXCOUNT=$(psql…
user739866

46 votes, 6 answers

How to get the first column of every line from a CSV file?

How do I get the first column of every line in an input CSV file and output it to a new file? I am thinking of using awk but am not sure how.
Junba Tester

46 votes, 3 answers

Using awk to print characters of specific index on a line

Alright, so I know it is quite simple to print specific arguments of a line using $: $ cat file hello world $ awk '{print $1}' file hello But what if I want to print chars 2 through 8? or 3 through 7? Is that possible with awk?
Jordan

45 votes, 10 answers

Convert all number abbreviations to numeric values in a text file

I'd like to convert all number abbreviations such as 1K, 100K, 1M, etc. in a text file into plain numeric values such as 1000, 100000, 1000000, etc. So for example, if I have the following text file: 1.3K apples 87.9K oranges 156K mangos 541.7K…
chiappa

45 votes, 4 answers

Why does "1" in awk print the current line?

In this answer, awk '$2=="no"{$3="N/A"}1' file was accepted. Note the 1 at the end of the AWK script. In the comments, the author of the answer said [1 is] a cryptic way to display the current line. I'm puzzled. How does that work?
Aaron Digulla

44 votes, 6 answers

Using grep to search for hex strings in a file

Does anyone know how to get grep, or similar tool, to retrieve offsets of hex strings in a file? I have a bunch of hexdumps (from GDB) that I need to check for strings and then run again and check if the value has changed. I have tried hexdump and…
user650649

44 votes, 7 answers

How to print 5 consecutive lines after a pattern in file using awk

I would like to search for a pattern in a file and prints 5 lines after finding that pattern. I need to use awk in order to do this. Example: File Contents: . . . . ####PATTERN####### #Line1 #Line2 #Line3 #Line4 #Line5 . . . How do I parse through…
tomkaith13