Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK is an interpreted programming language (the name stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, and each record is split into fields (by default, records are separated by newlines and fields by horizontal whitespace). For each record, every condition is checked in turn and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field. If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record after any transformations. Since a non-zero number is treated as true, awk '1' file instructs awk to perform the default action (print) for every line.
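
The rules above can be sketched as follows. This is only an illustration, assuming a POSIX awk on the PATH; the three-line sample input and the shell variable names (input, high, nonzero, counts, all) are made up for the example.

```shell
# Hypothetical sample input: a name and a score on each line.
input='alice 10
bob 0
carol 7'

# Condition and action: print field 1 of records whose field 2 exceeds 5.
high=$(printf '%s\n' "$input" | awk '$2 > 5 { print $1 }')

# Condition only: the default action (print $0) runs for matching records.
nonzero=$(printf '%s\n' "$input" | awk '$2 != 0')

# Action only: runs for every record; NF is the number of fields.
counts=$(printf '%s\n' "$input" | awk '{ print NF }')

# A constant true condition with no action prints every record unchanged.
all=$(printf '%s\n' "$input" | awk '1')

printf '%s\n' "$high" "$nonzero" "$counts" "$all"
```

Here high holds alice and carol, nonzero holds the two records whose score is not 0, and all is identical to the input.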

An awk program can also have optional BEGIN and END blocks: the BEGIN action runs once before any input is read, and the END action runs once after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
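
A minimal sketch of this layout, again assuming a POSIX awk; the three input numbers and the variable names sum and total are hypothetical.

```shell
# Hypothetical input: one number per line.
total=$(printf '%s\n' 3 4 5 | awk '
BEGIN { sum = 0 }                              # runs once, before any input is read
      { sum += $1 }                            # no condition: runs for every record
END   { print "total:", sum, "records:", NR }  # runs once, after the last record
')

printf '%s\n' "$total"
```

Initialising sum in BEGIN is optional, since unset awk variables already behave as 0 in arithmetic; it is spelled out here only to show where BEGIN fits.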

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common implementation, found on most Unix-like systems. It is also specified by a well-defined IEEE (POSIX) standard.
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - during the development of AWK, the developers released a new version (new awk) to avoid confusion, but it is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. The only version in which the developers attempted to add i18n support. It also allows users to write their own C shared libraries to extend it with their own "plug-ins". This version is the standard implementation for Linux.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • (GNU's version of awk)
  • (A very old, pre-POSIX version also from AT&T)
  • (A different interpreter written by Mike Brennan)
  • (A kindred tool often mentioned in the same breath)
32722 questions
5 votes, 1 answer

using awk in tcl script

I want to print a particular column of a file from within a Tcl script. I tried exec awk '{print $4}' foo, where foo is the filename, but it does not work and gives the error: can't read "4": no such variable. How can I do above awk in tcl…
ravi

5 votes, 2 answers

Convert .gitignore to rsync merge filter include file? (with sed or awk)

I tried using rsync --filter=':+ .gitignore' (-/exclude works but not include) to no avail. Basically I just want to include the .ignore file in a script and upload everything in it with rsync to the remote. If anyone would have the skills to sed…
sabgenton

5 votes, 1 answer

Awk merge the results of processing two files into a single file

I use awk to extract and calculate information from two different files and I want to merge the results into a single file in columns ( for example, the output of first file in columns 1 and 2 and the output of the second one in 3 and 4 ). The…
user2245731

5 votes, 3 answers

Convert exponentials and rounding numbers in BASH

I have a file like this: 1.5000000000E-01 7.5714285714E+00 4.0000000000E-01 2.5000000000E-01 7.5714285714E+00 4.0000000000E-01 and I have to convert it to something like 0.15 7.57 0.40. I mean I want the numbers to have only 2 decimals and not to be…
ayasha

5 votes, 3 answers

Print package dependency tree

Using this file, I would like to print a tree of package dependencies, given a single base package. For example, take the Bash package @ bash # few lines removed requires: coreutils libintl8 libncursesw10 libreadline7 _update-info-dir cygwin I…
Zombo

5 votes, 6 answers

How to add single quotes around columns using awk

Just wondering how can I add single quotes around fields, so I can import it to mysql without warnings or errors. I have a csv file with lots of content. 16:47:11,3,r-4-VM,250000000.,0.50822578824,131072,0,0,0,0,0 Desired output…
Deano

5 votes, 4 answers

How to delete an entire row if a specific column contains a zero?

I need to delete the rows that contain a 0 value in column number 5. Before: 1QWO 10 5 45 100 7.5 1R0R 6 3 15 100 8.5 1R5M 4 0 6 0 6.5 1R8S 4 0 6 0 6 1R9L 2 …
user2176228

5 votes, 2 answers

sort a file based on a column in another file

I have two files both in the format of: loc1 num1 num2 loc2 num3 num4 The first column is the location and I want to use the order of the locations in the first file to sort the second file so that I can put the two files together where the…
olala

5 votes, 3 answers

awk, print lines which start with four digits

I want to print all lines from a file which begin with four digits. I tried this already but it does not work: cat data.txt | awk --posix '{ if ($1 ~ /^[0-9]{4}/) print $1}' No output is generated. The next line prints all lines which start with a…
Lukas Banach

5 votes, 3 answers

Picking up CSV fields by name using awk

Suppose I have a CSV file with headers of the following form: Field1,Field2 3,262000 4,449000 5,650000 6,853000 7,1061000 8,1263000 9,1473000 10,1683000 11,1893000 I would like to write an awk script which will take a comma-separated list of field…
merlin2011

5 votes, 2 answers

Linux - numerical sort then overwrite file

I have a csv file with a general format date, 2013.04.04, 2013.04.04, 2012.04.02, 2013.02.01, 2013.04.05, 2013.04.02, a script I run will add data to this file which will not necessarily be in date order. How can I sort the file into date order…
moadeep

5 votes, 2 answers

Command to replace specific column of csv file for first 100 rows

The following command replaces the second column with the value e throughout a CSV file, but what if I want to replace it only in the first 100 rows? awk -F, '{$2="e";}1' OFS=, file The rest of the rows of the CSV file should stay intact.
user752590

5 votes, 3 answers

Using awk and df (disk free) to show only mount name and space used

What would be the correct CL sequence to execute a df -h and only print out the mount name and used space (percentage)? I'm trying to do a scripted report for our servers. I tried df -h | awk '{print $1 $4}' which spits out $df -h | awk '{print $1…
Rick

5 votes, 5 answers

how to insert a newline \n after x numbers of words, with AWK or Sed

I have this in one line: We were born in the earth beyond the land I want it in lines of 3 words, to be like this: We were born in the earth beyond the land
user2275669

5 votes, 5 answers

Move line(s) to follow another line in a file

I got a file that has a line in the file like this: check=('78905905f5a4ed82160c327f3fd34cba') I'd like to be able to move this line to follow a line that looks like this: files=('somefile.txt') The array though at times that can span multiple…
Todd Partridge 'Gen2ly'