Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK is an interpreted programming language (AWK stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of horizontal whitespace. For each record, every condition is checked in turn and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field, while $0 refers to the whole record. If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record, including any modifications made to it so far. Since a non-zero number counts as true, awk '1' file instructs awk to perform the default action (print) for every line.
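
As a rough illustration of the rules above, the following shell sketch runs a few one-line awk programs over a small made-up data set (the names, departments and numbers are hypothetical sample data, not taken from any question listed here). It relies only on the behaviour described in this section, plus the standard -F option for choosing a different field separator:

```shell
# Hypothetical sample data: name, department, salary (whitespace-separated)
data='alice sales 300
bob eng 500
carol eng 450'

# Condition and action: print the first field of records whose third field exceeds 400
high=$(printf '%s\n' "$data" | awk '$3 > 400 { print $1 }')

# Missing condition: the action runs for every record
depts=$(printf '%s\n' "$data" | awk '{ print $2 }')

# Missing action: the default action (print $0) runs for each matching record
eng=$(printf '%s\n' "$data" | awk '$2 == "eng"')

# A constant non-zero condition and no action: every record is printed unchanged
all=$(printf '%s\n' "$data" | awk '1')

# -F sets the field separator; here fields are split on ":" instead of whitespace
second=$(printf 'a:b:c\n' | awk -F: '{ print $2 }')

printf '%s\n' "$high" "$depts" "$eng" "$all" "$second"
```

Capturing each result in a shell variable is only a convenience for this sketch; in everyday use the awk command is usually run directly on a file, as in awk '$3 > 400 { print $1 }' file.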

An awk program can also have optional BEGIN and END blocks: the BEGIN action runs before any input is read, and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
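
A minimal sketch of this layout, assuming a made-up two-column input (item name and quantity, purely hypothetical): BEGIN initialises a counter, the unconditional action updates it for every record, and END reports the result once the input is exhausted.

```shell
# Hypothetical sample data: item name and quantity
total=$(printf 'apple 3\npear 5\nplum 4\n' | awk '
BEGIN { sum = 0 }              # runs once, before any input is read
      { sum += $2 }            # no condition, so it runs for every record
END   { print "total:", sum }  # runs once, after all input has been read
')

echo "$total"
```

The explicit initialisation in BEGIN is optional here, since an unset awk variable already behaves as 0 in arithmetic; it is kept only to show where BEGIN fits.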

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common name; an implementation under this name is found on most Unix-like systems. The language is also specified by a well-defined IEEE (POSIX) standard.
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - during the development of AWK, the developers released a new version ("new awk") under this name to avoid confusion, but it is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. The only version whose developers attempted to add i18n support. It also allows users to write their own C shared libraries to extend it with their own "plug-ins". It is the default implementation on many Linux distributions.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • gawk (GNU's version of awk)
  • nawk (A very old, pre-POSIX version also from AT&T)
  • mawk (A different interpreter written by Mike Brennan)
  • (A kindred tool often mentioned in the same breath)
32722 questions
5 votes, 1 answer

floating point calculations in awk

I am surprised by the behaviour of awk while performing floating point calculations. It led me to a wrong calculation on table data. $ awk 'BEGIN {print 2.3/0.1}' 23 <-- Ok $ awk 'BEGIN {print int(2.3/0.1)}' 22 <-- Wrong! $ awk 'BEGIN {print…
jkshah
5 votes, 1 answer

awk: syntax error near unexpected token `('

I tried to assign the output of an awk command to a variable: USERS=$(awk '/\/X/ {print $1}' <(w)) This line is part of the following script: #!/bin/sh INTERFACE=$1 # The interface which is brought up or down STATUS=$2 # The new state of the…
orschiro
5 votes, 2 answers

Awk greater than less than but within a set range

I have a script that basically evaluates 2 decimal numbers. if (( $(echo "$p $q" | awk '{ print ($1 < $2)}') )); then echo "Evaluation: Acceptable!" q is a decimal or number from user input. p is a calculated figure. Consequently, if p=1, and…
dat789
5 votes, 1 answer

Parsing Json data columnwise in shell

When I run a command I get a response like this { "status": "available", "managed": true, "name":vdisk7, "support":{ "status": "supported" }, "storage_pool": "pfm9253_pfm9254_new", "id":…
The_Lost_Avatar
5 votes, 3 answers

Need help in splitting filename and folder path using shell script

I am a novice in shell scripting. I need to split this following file structure as filename separate and folder path separate. In the filename, I don't need _ABF1_6, as it is not part of the filename. Also this _ABF1_6 changes from file path to path…
user2900008
5 votes, 3 answers

AWK vs MySQL for Data Aggregation

In trying to figure out if AWK or MySQL is more efficient for processing log files and returning aggregate stats, I noticed the following behavior which doesn't make sense to me: To test this I used a file that had 4 columns and approximately 9…
Ballard
5 votes, 3 answers

Find and replace pattern of fileA in fileC by fileB pattern

I have two files, fileA with a list of names: AAAAA BBBBB CCCCC DDDDD and another fileB with another list: 111 222 333 444 and a third fileC with some text: Hello AAAAA toto BBBBB dear "AAAAA" trird BBBBBB tuizf AAAAA dfdsf CCCCC So I need to…
Peter Dev
5 votes, 6 answers

Find specific pattern and print complete text block using awk or sed

How can I find a specific number in a text block and print the complete text block beginning with the key word "BEGIN" and ending with "END"? Basically this is what my file looks like: BEGIN A: abc B: 12345 C: def END BEGIN A: xyz B: 56789 C:…
edloaa
5 votes, 3 answers

How to print nth line from the pattern?

I am trying to make a script to summarize a file containing below logs in short format. Snippet of log : $ cat input.txt ffffff 1301 2012-08-29T03:13:33 clr crit Some serious problem cccc dddddd …
user2809888
5 votes, 4 answers

reordering columns with AWK

I need to reorder the columns of this (tab-separated) data: 1 cat plays 1 dog eats 1 horse runs 1 red dog 1 the cat 1 the cat so that it prints like: cat plays 1 dog eats 1 horse runs 1 red dog 1 the cat…
owwoow14
5 votes, 5 answers

converting hexadecimal to decimal from a csv-like txt file

I am using a bash terminal and I have a .txt file in which I have three columns of hex numbers separated by a space. I'd like to convert them to decimal numbers. I tried the last command from here Converting hexadecimal to decimal using awk or sed…
merlinuxxx
5 votes, 5 answers

Awk script: How to prevent ARGV from being treated as an input file name

It seems that an awk script considers ARGV[1] to ARGV[ARGC] as input files. Is there any way to make awk treat ARGV as simple arguments instead of input file names? Example: test.awk #!/usr/bin/awk -f BEGIN {title=ARGV[2]} {if ($1=="AA") {print…
Alain
5 votes, 2 answers

awk variable assignment statement explanation needed

OK, straight to the point, here is the code; I formatted it a little to make it easy to read: awk '{ t=$0 ; $0=t ; $0=// ; print "$0=// ; value of $0 is ",$0 $0=t ; $0=/./ ; print "$0=/./ ; value…
Kent
5 votes, 4 answers

Print all lines between pattern where any line matches second pattern

I have an email log file like this: 2013-09-11 12:02:08 INFO: ------------------------------ 2013-09-11 12:02:08 INFO: Javamail session sending email 2013-09-11 12:02:08 INFO: Session properties: 2013-09-11 12:02:08 INFO: …
Billy
5 votes, 7 answers

How can I remove leading and trailing zeroes from numbers with sed/awk/perl?

I have a file like this: pup@pup:~/perl_test$ cat numbers 1234567891 2133123131 4324234243 4356257472 3465645768000 3424242423 3543676586 3564578765 6585645646000 0001212122 1212121122 0003232322 In the above file I want to remove the leading and…
EngineSense