Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK is an interpreted programming language (the name stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of horizontal whitespace. For each record, every condition is checked in order and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index (e.g. $2 for the second field), and $0 refers to the whole record. If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record, including any modifications made to it so far. Since a non-zero number is treated as true, awk '1' file instructs awk to perform the default action (print) for every line.
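As a rough illustration of the three forms described above (condition plus action, condition only, action only), here is a minimal sketch. The sample data and the shell variable names (sample, high, matched, counts) are made up for this example and are not part of awk itself.

```shell
# Hypothetical sample data: a name and a score per line, separated by whitespace.
sample='alice 10
bob 25
carol 7'

# Condition plus action: print the first field of records whose second field exceeds 8.
high=$(printf '%s\n' "$sample" | awk '$2 > 8 { print $1 }')

# Condition only: the default action (print $0) runs for each matching record.
matched=$(printf '%s\n' "$sample" | awk '/^b/')

# Action only: runs for every record; NF holds the number of fields in the record.
counts=$(printf '%s\n' "$sample" | awk '{ print NF }')

printf '%s\n' "$high" "$matched" "$counts"
```

With this input, the first program selects alice and bob, the second prints the whole bob record, and the third prints the field count of each of the three records.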

An awk program can also have an optional BEGIN block and an optional END block. The BEGIN action is run before any input is read, and the END action is run after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
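A small sketch of how BEGIN, a per-record action, and END can fit together, here summing the second column. The input data and the names data, total and sum are assumptions chosen for illustration.

```shell
# Hypothetical input: a label and a number per line.
data='a 23
b 8
a 22'

total=$(printf '%s\n' "$data" | awk '
BEGIN { sum = 0 }            # runs once, before any input is read
      { sum += $2 }          # no condition, so this runs for every record
END   { print "total", sum } # runs once, after all input has been read
')

printf '%s\n' "$total"
```

The explicit initialization in BEGIN is optional, since awk treats an unset variable as 0 in a numeric context; it is included only to show where BEGIN runs.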

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common implementation, found on most Unix-like systems. Its behavior is also defined by an IEEE standard.
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - "new awk", the name under which the developers released an updated version during the development of AWK, to avoid confusion with the original. It is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. It is the only version whose developers attempted to add i18n support, and it allows users to write their own C shared libraries to extend it with "plug-ins". This version is the standard implementation on Linux.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • gawk (GNU's version of awk)
  • nawk (A very old, pre-POSIX version also from AT&T)
  • mawk (A different interpreter written by Mike Brennan)
  • (A kindred tool often mentioned in the same breath)
32722 questions
40 votes, 5 answers

Remove the first word in a text stream

How would I remove the first word from each line of text in a stream? For example, $ cat myfile some text 1 some text 2 some text 3 I want: $ cat myfile | magiccommand text 1 text 2 text 3 How would I go about this using Bash? I could use awk…
Trcx
40 votes, 3 answers

How to print regexp matches using awk?

Is there a way to print a regexp match (but only the matching string) using awk command in shell?
Istvan
40 votes, 5 answers

Use awk to find first occurrence only of string after a delimiter

I have a bunch of documents that all have the line, Account number: 123456789 in various locations. What I need to do is be able to parse through the files, and find the account number itself. So, awk needs to look for Account number: and return the…
DrDavid
40 votes, 5 answers

Regex to batch rename files in OS X Terminal

I'm after a way to batch rename files with a regex, i.e. s/123/onetwothree/g. I recall I can use awk and sed with a regex but couldn't figure out how to pipe them together for the desired output.
user370507
40 votes, 7 answers

GROUP BY/SUM from shell

I have a large file containing data like this: a 23 b 8 a 22 b 1 I want to be able to get this: a 45 b 9 I can first sort this file and then do it in Python by scanning the file once. What is a good direct command-line way of doing this?
Legend
39 votes, 4 answers

Deleting the first two lines of a file using BASH or awk or sed or whatever

I'm trying to delete the first two lines of a file by just not printing it to another file. I'm not looking for something fancy. Here's my (failed) attempt at awk: awk '{ (NR > 2) {print} }' myfile That throws out the following error: awk: { NR >…
Amit
39 votes, 6 answers

Using AWK to filter out column with numerical ranges

I'm relatively new to BASH and I'm trying to use awk to filter out column 1 data based on the 4th column of a text file. If the 4th column of data matches the range of x, then it'll output column 1 data. "x" is supposed to be a range of numbers 1-10…
BurN135
39 votes, 5 answers

How to format console output in columns

I have the following text file: [master]$ cat output.txt CHAR.L 96.88 -6.75 (-6.49%) MXP.L 12.62 -1.00 (-7.41%) NEW.L 7.88 -0.75 (-8.57%) AGQ.L 17.75 -0.62 (-3.40%) RMP.L 13.12 -0.38 (-2.75%) RRR.L 3.35 -0.20 (-5.71%) RRL.L…
ktec
39 votes, 7 answers

How to cut a string after a specific character in unix

So I have this string: $var=server@10.200.200.20:/home/some/directory/file I just want to extract the directory address meaning I only want the bit after the ":" character and get: /home/some/directory/file thanks. I need a generic command so the…
canecse
39 votes, 5 answers

Rename files recursively Mac OSX

Attempting to rename a bunch of files. I can rename any instances of foo with bar in the current directory with: ls . | awk '{print("mv "$1" "$1)}' | sed 's/foo/bar/2' | /bin/sh What can I add to make it recursive? Edit/My solution I don't…
Adam Waite
39 votes, 3 answers

How to save the output of this awk command to file?

I want to save the output of this command to another text file: awk '{print $2}'. It extracts from a text file. Now I want to save the output to another text file. Thanks.
user2034825
39 votes, 4 answers

Invoking a script, which has an awk shebang, with parameters (vars)

I have an awk script that I have defined thus: #!/usr/bin/env awk BEGIN { if (!len) len = 1; end = start + len } { for (i = start; i < end; i++) { print $1 } } I have saved it as columns and chmod +x'd it. I want invoke it so that start and end are…
Ollie Saunders
38 votes, 4 answers

Number of fields returned by awk

Is there a way to get awk to return the number of fields that meet a field-separator criteria? Say, for instance, the file contains: a b c d So, awk -F=' ' | should return 4.
Sriram
38 votes, 2 answers

Summing values of a column using awk command

I want to sum the values of all rows in the column 3. How can I do this? Input: chr19 10 11 chr19 12 15 chr19 11 29 chr19 a0 20 Expected output: 75
Rashedul Islam
38 votes, 3 answers

Using output of awk to run command

I am brand new to shell scripting and cannot seem to figure out this seemingly simple task. I have a text file (ciphers.txt) with about 250 lines, and I would like to use the first column of each line as an argument in a command. Any help would be…
nLee