Questions tagged [awk]

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. AWK is used largely with Unix systems.

AWK is an interpreted programming language (the name stands for Aho, Weinberger, and Kernighan) designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, and each record is split into fields; by default, records are separated by newlines and fields by runs of whitespace (spaces and tabs). For each record, every condition is checked in order and, if it is true, the commands in its action block are executed. Within the action block, fields are accessed by a 1-based index, e.g. $2 for the second field; $0 is the whole record. If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record, including any modifications made to it so far. Since a non-zero number is treated as true, awk '1' file performs the default action (print) for every line.
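The rules above can be tried out with a few one-liners. This is only a minimal sketch: the sample records (name, department, salary) and the variable name sample are made up for illustration and do not come from any real data set.

```shell
# Hypothetical sample data: name, department, salary, separated by spaces.
sample='alice eng 120
bob ops 95
carol eng 130'

# Condition and action: print field 1 of records whose field 2 is "eng".
printf '%s\n' "$sample" | awk '$2 == "eng" { print $1 }'
# alice
# carol

# Condition only: the default action prints the whole record ($0).
printf '%s\n' "$sample" | awk '$3 > 100'
# alice eng 120
# carol eng 130

# Action only: runs for every record; commands are separated by ";".
printf '%s\n' "$sample" | awk '{ doubled = $3 * 2; print $1, doubled }'
# alice 240
# bob 190
# carol 260

# A constant true condition with no action: every record is printed as-is.
printf '%s\n' "$sample" | awk '1'
```

The last command behaves like cat for this input, which is why awk '1' often appears at the end of programs that modify $0 and then print it.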

An awk program can also have optional BEGIN and END blocks: the BEGIN action runs before any input is read, and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
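As a rough illustration of this layout, the sketch below prints a header in BEGIN, filters and accumulates in a regular rule, and prints a total in END. The shell function name report, the variable orders, and the item/quantity data are all hypothetical.

```shell
# Hypothetical helper: reads "item quantity" records on standard input.
report() {
    awk '
    BEGIN   { print "item qty" }             # runs once, before any input is read
    $2 >= 5 { total += $2; print }           # runs for each record with quantity >= 5
    END     { print "total", total }         # runs once, after all input is read
    '
}

# Hypothetical sample data.
orders='widget 4
gadget 10
gizmo 6'

printf '%s\n' "$orders" | report
# item qty
# gadget 10
# gizmo 6
# total 16
```

The variable total needs no initialization here because an unset awk variable acts as 0 in a numeric context.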

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common name; an implementation under this name is found on most Unix-like systems. Its behavior is specified by an IEEE standard (POSIX).
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - during the development of AWK, the developers released a new version under the name "new awk" to avoid confusion with the original; it is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. It is the only version in which the developers attempted to add i18n support, and it allows users to write C shared libraries that extend it with their own "plug-ins". This version is the default awk on many Linux distributions.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • gawk (GNU's version of awk)
  • nawk (A very old, pre-POSIX version also from AT&T)
  • mawk (A different interpreter written by Mike Brennan)
  • sed (A kindred tool often mentioned in the same breath)
32722 questions
5 votes, 5 answers

Can you colorize specific lines that are grepped from a file?

I run a weekly CRONTAB that collects hardware info from 40+ remote servers and creates a weekly log file on our report server at the home office. I have a script that I run against this weekly file to output only specific status lines to my…
PCnetMD

5 votes, 3 answers

pattern match and replace the string with if else loop

I have a file containing multiple lines starting with "1ECLI H--- 12.345 .....". I want to remove the space between I and H and add R/S/T upon iteration of the H pattern. E.g. H810, if repeated in three consecutive lines, should get added with a…
amruta

5 votes, 2 answers

List Gitlab runners by name only

I am trying to get a list of only the names of gitlab runners. So the output of gitlab-runner list 2>&1 is: Listing configured runners ConfigFile=/etc/gitlab-runner/config.toml default_runner …
Rickkwa

5 votes, 2 answers

String conversion from numeric to string in awk

I have the awk command below, which gives the result shown: echo '1600000.00000000' '1600000.0000' | awk '{ print ($1 != $2) ? "true" : "false" }' Result: false. Compared as numeric values, the result given by the command is correct. But I want…
Amol Murkute

5 votes, 8 answers

Take every nth row from a file with groups and n is a given in a column

I have seen here and here on how to return every nth row; but my problem is different. A separate column in the file provides specifics about which nth element to return; which are different depending on the group. Here is a sample of the dataset…
deepseefan

5 votes, 2 answers

awk match multiple pattern in column

What is the proper awk syntax to match multiple patterns in one column? Having a columnar file like this: c11 c21 c31 c12 c22 c32 c13 c23 c33 how to exclude lines that match c21 and c22 in the second column. With grep, one can do something like…
PedroA

5 votes, 3 answers

Grab an IdentityFile from an ssh config based on a variable hostname via shell script

I'm writing a shell script where I need to obtain an IdentityFile from an ssh config file. The ssh config file looks like this: Host AAAA User aaaa IdentityFile /home/aaaa/.ssh/aaaaFILE IdentitiesOnly yes PubkeyAuthentication=yes …
Wimateeka

5 votes, 3 answers

How to get the group by count in unix

I have a list of records as follows: Item1,200 Item1,200 Item3,900 Item2,500 Item2,800 Item1,600 Item4, Item5, Item4,100 Item5, Item5,444 My output should be "Please check the file as Item1 is greater than 2" With my awk command the output is…
Bobby

5 votes, 2 answers

awk gsub using variable for pattern matching

gsub(pattern, replacement, target): allows a variable to be used for pattern, but does not let me do regular expression. gsub(/pattern/, replacement, target): lets me do regular expression, but I cannot use a variable for the pattern. Is there a way…
ddjen11

5 votes, 3 answers

Using awk to split line with multiple string delimiters

I have a file called pet_owners.txt that looks like: petOwner:Jane,petName:Fluffy,petType:cat petOwner:John,petName:Oreo,petType:dog ... petOwner:Jake,petName:Lucky,petType:dog I'd like to use awk to split the file using the delimiters: 'petOwner',…
Brinley

5 votes, 1 answer

awk: printing a tab-delimited instead of a space-delimited file

I am using this awk command to print out certain columns of a file awk '{print $1,$2,$3,$5,log($11),$12}' inputfile > outputfile The inputfile is tab-delimited, and the outputfile is space-delimited. I need the outputfile to be tab-delimited as…
Abdel

5 votes, 5 answers

How can I use awk or Perl to increment a number in a large XML file?

I have an XML file with the following line: I would like to increment this value by .04 and keep the format of the XML in place. I know this is possible with a Perl or awk script, but…
DC.

5 votes, 3 answers

Pick up lines from a file based on line numbers in another file

I have two files - one contains the addresses (line numbers) and the other one data, like this: address file: 2 4 6 7 1 3 5 data file 1.000451451 2.000589214 3.117892278 4.479511994 5.484514874 6.784499874 7.021239396 I want to randomize the data…
hassan

5 votes, 7 answers

How to delete all the lines after the last occurrence of pattern?

I want to delete all the lines after the last occurrence of pattern except the pattern itself file.txt honor apple redmi nokia apple samsung lg htc file.txt what I want honor apple redmi nokia apple what I have tried sed -i '/apple/q'…
j.doe

5 votes, 4 answers

Duplicate a CSV column in Bash

I have an issue where a client needs to duplicate a column in a CSV file. The values are always going to be identical and unfortunately our API doesn't allow for duplicate columns to be specified in the JSON. For example I have the following column…
Predator