Questions tagged [awk]

AWK is an interpreted programming language (AWK stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of horizontal whitespace. For each record, every condition is checked in order and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field, and $0 refers to the whole record.

If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record, including any modifications made to it so far. Since a non-zero number is treated as true, awk '1' file instructs awk to perform the default action (print) for every line.
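
A minimal sketch of the three forms described above: pattern with action, pattern only, and action only. The sample data and the shell variable names are made up for illustration.

```shell
# Sample input: a name and a score on each line (made-up data).
input='alice 42
bob 7
carol 19'

# Pattern with action: print the first field when the second exceeds 10.
big=$(printf '%s\n' "$input" | awk '$2 > 10 { print $1 }')

# Pattern without action: the default action prints the whole record.
matched=$(printf '%s\n' "$input" | awk '/^b/')

# Action without pattern: the action runs for every record.
swapped=$(printf '%s\n' "$input" | awk '{ print $2, $1 }')

printf '%s\n' "$big" "$matched" "$swapped"
```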

An awk program can also have optional BEGIN and END blocks: the BEGIN action runs before any input is read, and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
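
As a rough illustration of that layout, the sketch below sums a column of numbers. The input values and the variable name report are assumptions chosen for the example.

```shell
# Sum the first field of every record; BEGIN and END frame the main rule.
report=$(printf '3\n4\n5\n' | awk '
BEGIN { print "start" }        # runs before any input is read
      { sum += $1 }            # runs once per input record
END   { print "total", sum }   # runs after the last record
')

printf '%s\n' "$report"
```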

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common; found on most Unix-like systems. Its behavior is specified by an IEEE (POSIX) standard.
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - during the development of AWK, the developers released a new version ("new awk") under a separate name to avoid confusion with the original; it is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. The only version whose developers have attempted to add i18n support. It also allows users to extend it with "plug-ins" written as C shared libraries. It is the default awk on many Linux distributions.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • (GNU's version of awk)
  • (A very old, pre-POSIX version also from AT&T)
  • (A different interpreter written by Mike Brennan)
  • (A kindred tool often mentioned in the same breath)
32722 questions

5 votes, 2 answers

Calculate time difference using awk

I have the following input.txt file. I need to calculate the time difference of $2 and $3 and print difference in hours. P1, 2016-05-30 00:11:20, 2016-05-30 04:36:40 P2, 2016-05-30 00:07:20, 2016-05-30 04:32:31 I have the…
user3834663 (reputation 513; badges 2, 7, 17)

5 votes, 2 answers

AWK FNR==NR on an Empty File

I am running the following command, which works great as long as there is content in the first file: awk -F, 'FNR==NR {a[tolower($1)]++; next} !a[tolower($1)]' OutSideSyncUsers.csv NewUsers.csv If the first file is empty, the command doesn't work. I…
moore1emu (reputation 476; badges 8, 27)
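
For context on why FNR==NR misbehaves here: with an empty first file, NR and FNR are still equal while awk reads the second file, so every line is treated as belonging to the first. One commonly used workaround, sketched below under assumptions, is to compare FILENAME with ARGV[1] instead. The file names old.csv and new.csv and their contents are hypothetical stand-ins, not the asker's data.

```shell
tmp=$(mktemp -d)
: > "$tmp/old.csv"                          # hypothetical empty first file
printf 'Ann,1\nBob,2\n' > "$tmp/new.csv"    # hypothetical second file

# Records from the first file fill the lookup table; everything else is
# printed only if its lowercased first field was not seen there.
kept=$(awk -F, 'FILENAME == ARGV[1] { seen[tolower($1)]++; next }
                !seen[tolower($1)]' "$tmp/old.csv" "$tmp/new.csv")

printf '%s\n' "$kept"
rm -rf "$tmp"
```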

5 votes, 1 answer

How to iterate over each output of awk with sh?

I have a test.sh file which awks through a text file test.bed: 1 31 431 1 14 1234 1 54 134 2 435 314 2 314 414 with while read line;do echo $line # ... more code ... # done < <(awk -F '\t' '$1 == 1 {print $0}' test.bed…
Niek de Klein (reputation 8,524; badges 20, 72, 143)

5 votes, 1 answer

line too long error with columns

I'm trying to align text vertically with a delimiter in Geany text editor: idxMathExpress (MathArcCos _) = 120 idxMathExpress (MathArcSin _) = 130 idxMathExpress (MathArcTan _) = 140 I would like this block to be aligned like…
JeanJouX (reputation 2,555; badges 1, 25, 37)

5 votes, 3 answers

Using AWK to place each word in a text file on a new line

I'm trying to use AWK to place every word within a text document on a new line. I don't really know how to use AWK but I've found some commands online which should solve my problem. I've tried the following commands: $ awk '{ for (i = 1; i <= NF;…
hjalpmig (reputation 702; badges 1, 13, 39)
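
A minimal sketch of the loop-over-fields approach that the excerpt begins to show. The original command is cut off, so this is not necessarily what the asker ran, and the sample text is made up.

```shell
# Print each whitespace-separated word on its own line.
# NF is the number of fields in the current record; $i is the i-th field.
words=$(printf 'the quick  brown\nfox\n' |
        awk '{ for (i = 1; i <= NF; i++) print $i }')

printf '%s\n' "$words"
```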

5 votes, 5 answers

shell insert a line every n lines

I have two files and I am trying to insert a line from file2 into file1 every other 4 lines starting at the beginning of file1. So for example: file1: line 1 line 2 line 3 line 4 line 5 line 6 line 7 line 8 line 9 line 10 file2: 50 43 21 output I…
jeabesli (reputation 83; badges 1, 1, 7)

5 votes, 5 answers

How to use awk to test if a column value is in another file?

I want to do something like if ($2 in another file) { print $0 } So say I have file A.txt which contains aa bb cc I have B.txt like 00,aa 11,bb 00,dd I want to print 00,aa 11,bb How do I test that in awk? I am not familiar with the tricks of…
CuriousMind (reputation 15,168; badges 20, 82, 120)
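
One common way to approach this kind of lookup is the two-file idiom: load the keys from the first file into an array, then test membership with the in operator while reading the second. The sketch assumes A.txt holds one value per line, which the flattened excerpt does not confirm.

```shell
tmp=$(mktemp -d)
printf 'aa\nbb\ncc\n' > "$tmp/A.txt"            # assumed: one key per line
printf '00,aa\n11,bb\n00,dd\n' > "$tmp/B.txt"

# NR == FNR holds only while the first file is being read.
found=$(awk -F, 'NR == FNR { keys[$1]; next }   # remember every key from A.txt
                 $2 in keys' "$tmp/A.txt" "$tmp/B.txt")

printf '%s\n' "$found"
rm -rf "$tmp"
```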

5 votes, 3 answers

Filter ldapsearch with awk/bash

By writing an ldapsearch command such as: ldapsearch -h ipaddress -p 389 -D "cn=func_01_acc,ou=admins,dc=akademia,dc=int" \ -w akademia.01 -b "ou=stud01,dc=akademia,dc=int" "(l=Torun)" sn cn telephonenumber -LLL | grep sn: | awk '{print $2 "|" $1}'…
J. Doe (reputation 51; badges 1, 2)

5 votes, 9 answers

Shell script numbering lines in a file

I need to find a faster way to number lines in a file in a specific way using tools like awk and sed. I need the first character on each line to be numbered in this fashion: 1,2,3,1,2,3,1,2,3 etc. For example, if the input was this: line 1 line…
Douglas Anderson (reputation 4,652; badges 10, 40, 49)

5 votes, 2 answers

Using pipes in an alias

I have this in my .bashrc: alias jpsdir="jps | awk '{print $1}' | xargs pwdx" but when I use jpsdir I get this output: pwdx: invalid process id: JvmName but running jps | awk '{print $1}' | xargs pwdx gives the correct results: 1234:…
dmc (reputation 401; badges 4, 14)
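
The likely cause is shell quoting: inside double quotes the shell expands $1 when the alias is defined, so awk receives '{print }' and prints the whole line. The sketch below reproduces the effect without jps or pwdx. The variable names and the sample line are assumptions for the demonstration.

```shell
# Make $1 empty for this demo; in an interactive shell it is normally
# unset and likewise expands to nothing.
set -- ""

broken="awk '{print $1}'"    # $1 is expanded right here: awk '{print }'
fixed="awk '{print \$1}'"    # escaped, so awk still receives a literal $1

sample='1234 JvmName'        # made-up jps-style output line
out_broken=$(printf '%s\n' "$sample" | eval "$broken")
out_fixed=$(printf '%s\n' "$sample" | eval "$fixed")

printf '%s\n' "$out_broken" "$out_fixed"
```

The same escaping (\$1) inside the double-quoted alias definition, or wrapping the pipeline in a shell function instead of an alias, should avoid the premature expansion.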

5 votes, 2 answers

awk search on multiple fields of a multi line record file

I have a file with records that are of the form: SMS-MT-FSM-DEL-REP country: IN 1280363645.979354_PFS_1_1887728354 SMS-MT-FSM-DEL-REP country: IN 1280363645.729309_PFS_1_1084296392 SMS-MO-FSM country:…
adaptive (reputation 263; badges 3, 7)

5 votes, 3 answers

How to use multiple passes with gawk?

I'm trying to use GAWK from CYGWIN to process a csv file. Pass 1 finds the max value, and pass 2 prints the records that match the max value. I'm using a .awk file as input. When I use the text in the manual, it matches on both passes. I can use…
Steve Kolokowsky (reputation 423; badges 2, 12)

5 votes, 4 answers

Understanding the awk -f option in shebang line

I am reading someone's awk script. Starts with the header #!/usr/bin/env awk -f. The env command does not have a -f option. So, they must be passing the -f option for the awk command. I looked at the man page for awk. It says Awk scans each input…
Prachi (reputation 528; badges 8, 31)

5 votes, 2 answers

Differences/merging two files

I have two lists of IP addresses. I need to split them into three files: the intersection, those from list1 only, and those from list2 only. Can I do this with awk/diff or any other simple Unix command? How? The files look like…
Zenet (reputation 6,961; badges 13, 38, 45)

5 votes, 2 answers

In Awk, how to call a function by using a string name?

I'm seeking a way to call an awk function by name, i.e. by using a string that is user input. My goal is to replace a lot of code like this... if (text == "a") a(x) if (text == "b") b(x) if (text == "c") c(x) ... with any kind of way to dispatch to…
joelparkerhenderson (reputation 34,808; badges 19, 98, 119)