Questions tagged [gawk]

gawk (short for GNU awk) is a free implementation of awk with manifold useful extensions.

AWK is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia
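As a rough illustration of the "data extraction and reporting" use described above, here is a minimal sketch. The sample data and the `report` function name are made up for this example and do not come from any question listed below. It sums the second column per key in the first column and prints one total per key.

```shell
# Hypothetical sample input: a name and a number on each line.
report() {
  printf 'alice 10\nbob 5\nalice 7\n' |
    # Accumulate column 2 per distinct value of column 1, then print the totals.
    awk '{ total[$1] += $2 } END { for (name in total) print name, total[name] }' |
    # The iteration order of "for (name in total)" is unspecified, so sort for stable output.
    sort
}

report
```

The same program runs unchanged under gawk and other POSIX awk implementations, since it uses no GNU extensions.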

981 questions
2
votes
1 answer

Trim first N bases in multi fasta file with awk and print with max width format

Background: The multi fasta format contains several sequence records; each record begins with a single-line description, followed by several lines of sequence (RNA, DNA, protein). The description line has a greater-than symbol at the beginning,…
Jose Ricardo Bustos M.
  • 8,016
  • 6
  • 40
  • 62
2
votes
1 answer

awk to ignore double quote and compare two files

I have two input files. File 1: 123 125 123 129 and file 2: "a"|"123"|"anc" "b"|"124"|"ind" "c"|"123"|"su" "d"|"122"|"aus" OUTPUT: "b"|"124"|"ind" "d"|"122"|"aus" Now how can I compare and print the difference of $1 from file1 and $2 from file2? I'm…
bongboy
  • 147
  • 1
  • 15
2
votes
2 answers

Awk, little endian order and 4 hex digits

Suppose I have a decimal number, e.g., 97254 ---> 00017BE6 (hex value), using: echo "" | awk '{printf("%08X", 97254)}' Now, if I want to convert the hex number (00017BE6, in this case) into 4 numbers of 2 digits (max 8 numbers in input) in little…
mikilinux
  • 89
  • 1
  • 9
2
votes
4 answers

Extract columns from a CSV file using Linux shell commands

I need to "extract" certain columns from a CSV file. The list of columns to extract is long and their indices do not follow a regular pattern. So far I've come up with a regular expression for a comma-separated value but I find it frustrating that…
SJU
  • 187
  • 1
  • 7
2
votes
1 answer

What is the rule of converting subscript of awk array?

I know the subscript of an awk array must be a string. [root@localhost]# awk 'END {array[A0]="empty"; print array[""]}' empty So in the above command line, because A0 is not quoted as "A0", it stands for a variable. Because the variable A0 hasn't…
Nan Xiao
  • 16,671
  • 18
  • 103
  • 164
2
votes
3 answers

gawk/sed: find a line and replace the 3rd column

I have a file: rs4648841 chr1 2365885 -- A T 0.40095 0.228978043022122 chr1:2523811 rs4648843 chr1 2366316 -- T C 0.15694 0.5736208829426915 chr1:2523811 rs61763906 chr1 2366517 -- A G 0.07726 0.5566728930776897 …
Alina
  • 2,191
  • 3
  • 33
  • 68
2
votes
2 answers

How to use AWK to calculate interleaved fields?

How can I use AWK to calculate some fields on different rows with the pattern like below? (column x, row m) + (column y, row (m+n)) Here's a data file to calculate for example, 1 2 3 4 5 6 7 8 .. => 1+4 3+6 5+8 ..
sof
  • 9,113
  • 16
  • 57
  • 83
2
votes
1 answer

awk: keep records with the highest value, comparing those that share other fields

I'm trying to write an awk script that keeps the records with a highest value in a given field, but only comparing records that share two other fields. I'd better give an example -- this is the input.txt: X A 10.00 X A 1.50 X B 0.01 X B 4.00 Y C…
xgrau
  • 299
  • 1
  • 2
  • 11
2
votes
3 answers

deleting header lines with no following content lines using awk

I think I've done this a couple of times but I can't do it this morning. I have a file like this for example. (this is the result of comparison of many files using foreach and diff, with file names enclosed with ### pattern) << file gg >> ###…
Chan Kim
  • 5,177
  • 12
  • 57
  • 112
2
votes
4 answers

AWK print based on FILENAME pattern

I have a directory of files with filenames of the form file000.txt to filennn.txt. I would like to be able to specify a range of file names and print the content of those files based on a match. I have achieved it with a single file pattern: $ gawk…
pelorus32
  • 23
  • 1
  • 4
2
votes
1 answer

record the lines in which each word in a given file appears using awk

Having a few problems doing this. The output needs to be of the following format: on each line, a word is first printed, followed by a colon “:”, then a space, and then the list of the line numbers where the word appears (separated by comma). If a…
chomp
  • 125
  • 9
2
votes
2 answers

Awk substring a single character

Here is columns.txt aaa bbb 3 ccc ddd 2 eee fff 1 3 3 g 3 hhh i jjj 3 kkk ll 3 mm nn oo 3 I can find the line where second column starts with "b": awk '{if(substr($2,1,1)=="b") {print $0}}' columns.txt I can find the line where second…
AWE
  • 4,045
  • 9
  • 33
  • 42
2
votes
3 answers

Extra newline coming from somewhere

Can someone explain what I'm doing wrong and how to do it better. I have a file consisting of records with field separator "-" and record separator "\t" (tab). I want to put each record on a line, followed by the line number, separated by a tab. The…
user1812457
2
votes
2 answers

AWK to process compressed files and printing original (compressed) file names

I would like to process multiple .gz files with gawk. I was thinking of decompressing and passing it to gawk on the fly but I have an additional requirement to also store/print the original file name in the output. The thing is there's 100s of .gz…
msciwoj
  • 772
  • 7
  • 23
2
votes
3 answers

String together awk commands

I'm writing a script that searches a file, gets info that it then stores into variables, and executes a program that I made using those variables as data. I actually have all of that working, but I need to take it a step further: What I currently…
ChangeJar
  • 163
  • 1
  • 1
  • 10