2

I have two tab-separated files. Comparing them by the first field, I need to print the lines whose first field does not match. The line printed must come from the first file (file1).

File1:

adu adu noun    singular    n/a n/a nominative
aduink  adu noun    plural  1pl n/a nominative
adum    adu noun    singular    1s  n/a nominative

File2:

adu adu noun    singular    n/a n/a nominative
aduink  adu noun    plural  1pl n/a nominative
xxadum  adu noun    singular    1s  n/a nominative

Desired output:

adum    adu noun    singular    1s  n/a nominative

What I'm thinking:

awk 'FNR==NR{a[$1]=$0;next} !($1 in a)' file1 file2

But I need to print the line from file1, not from file2, and I cannot change the order in which the files are processed.

Firefly
  • Your `FNR==NR` expression gets run on the first file listed after the awk script, in this case `file1`. That means that your subsequent expression, `!($1 in a)`, is evaluated against lines in `file2`. If you want to store `$1` of `file2` in the array and then compare lines of `file1` against the array, simply swap the order of the files on your awk command line. – ghoti Feb 26 '16 at 12:38

4 Answers

2

I don't understand why you can't change the order of the files (that would be simpler), but keeping the same order, you can do this:

awk 'NR==FNR{ a[$1]=$0; next }
     { delete a[$1] }
     END{ for (x in a) print a[x] }' file1 file2

The idea is to delete the item stored at index $1 for each line of the second file. At the end, only the file1 items with no match in file2 remain, and you just need to print them.
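One caveat: awk does not define the order in which `for (x in a)` visits the array, so with several unmatched lines the output may not follow file1's order. If that order matters, a minimal sketch of a variant (not the code above) could record each line's position in file1 and loop over the positions instead. The temporary directory and the `key`, `line`, `seen`, and `result` names are assumptions made for illustration.

```shell
# Recreate the question's sample files in a throwaway directory (illustrative only).
tmp=$(mktemp -d)
printf 'adu\tadu\tnoun\tsingular\tn/a\tn/a\tnominative\n'  >  "$tmp/file1"
printf 'aduink\tadu\tnoun\tplural\t1pl\tn/a\tnominative\n' >> "$tmp/file1"
printf 'adum\tadu\tnoun\tsingular\t1s\tn/a\tnominative\n'  >> "$tmp/file1"
printf 'adu\tadu\tnoun\tsingular\tn/a\tn/a\tnominative\n'  >  "$tmp/file2"
printf 'aduink\tadu\tnoun\tplural\t1pl\tn/a\tnominative\n' >> "$tmp/file2"
printf 'xxadum\tadu\tnoun\tsingular\t1s\tn/a\tnominative\n' >> "$tmp/file2"

# file1 is still read first; its lines are stored by position so that
# the END block can print the unmatched ones in their original order.
result=$(awk '
    NR == FNR { key[FNR] = $1; line[FNR] = $0; n = FNR; next }  # file1: remember key and line
    { seen[$1] }                                                # file2: note each first field
    END {
        for (i = 1; i <= n; i++)
            if (!(key[i] in seen)) print line[i]
    }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With the sample data this prints the single `adum` line from file1.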

Casimir et Hippolyte
1

Why don't you swap the order of the file arguments that you are passing to awk?

awk 'FNR==NR{a[$1]=$0;next} !($1 in a)' file2 file1
                                          |     |
                                         arg1  arg2
sat
1

If you can't change the file order when awk is called, just change it inside awk:

awk 'BEGIN{t=ARGV[1]; ARGV[1]=ARGV[2]; ARGV[2]=t} FNR==NR{a[$1];next} !($1 in a)' file1 file2

That way you don't have to store the lines of either file in memory; only the first fields of file2 are kept, as array keys.
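To see the swap in action, here is a small self-contained run using the question's sample data. The temporary directory and the `result` variable are assumptions made for illustration; the awk program is the one shown above.

```shell
# Recreate the question's sample files in a throwaway directory (illustrative only).
tmp=$(mktemp -d)
printf 'adu\tadu\tnoun\tsingular\tn/a\tn/a\tnominative\n'  >  "$tmp/file1"
printf 'aduink\tadu\tnoun\tplural\t1pl\tn/a\tnominative\n' >> "$tmp/file1"
printf 'adum\tadu\tnoun\tsingular\t1s\tn/a\tnominative\n'  >> "$tmp/file1"
printf 'adu\tadu\tnoun\tsingular\tn/a\tn/a\tnominative\n'  >  "$tmp/file2"
printf 'aduink\tadu\tnoun\tplural\t1pl\tn/a\tnominative\n' >> "$tmp/file2"
printf 'xxadum\tadu\tnoun\tsingular\t1s\tn/a\tnominative\n' >> "$tmp/file2"

# The command line still lists file1 first; the BEGIN block swaps the two
# arguments, so awk actually reads file2 first and file1 second.
result=$(awk '
    BEGIN { t = ARGV[1]; ARGV[1] = ARGV[2]; ARGV[2] = t }
    FNR == NR { a[$1]; next }   # first file read (file2): remember its first fields
    !($1 in a)                  # second file read (file1): print lines with unseen keys
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Because file1 is streamed line by line, the unmatched lines come out in file1's own order.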

Ed Morton
0

Late to the party, but here is a simpler way to do this:

$ join -v1 file1 file2

adum adu noun singular 1s n/a nominative

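That is, suppress the joined lines and print only the unpaired lines from the first file. By default, join matches on the first field. Note that join expects both files to be sorted on that field, and that by default it separates output fields with a single space rather than a tab.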
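If the inputs might not be sorted, or the tabs should survive in the output, a hedged sketch along these lines may help. It sorts copies of both files on the first field and passes a tab to `-t`; the temporary directory, the `.sorted` file names, and the `tab` and `result` variables are assumptions made for illustration.

```shell
# Recreate the question's sample files in a throwaway directory (illustrative only).
tmp=$(mktemp -d)
printf 'adu\tadu\tnoun\tsingular\tn/a\tn/a\tnominative\n'  >  "$tmp/file1"
printf 'aduink\tadu\tnoun\tplural\t1pl\tn/a\tnominative\n' >> "$tmp/file1"
printf 'adum\tadu\tnoun\tsingular\t1s\tn/a\tnominative\n'  >> "$tmp/file1"
printf 'adu\tadu\tnoun\tsingular\tn/a\tn/a\tnominative\n'  >  "$tmp/file2"
printf 'aduink\tadu\tnoun\tplural\t1pl\tn/a\tnominative\n' >> "$tmp/file2"
printf 'xxadum\tadu\tnoun\tsingular\t1s\tn/a\tnominative\n' >> "$tmp/file2"

tab=$(printf '\t')

# Sort both inputs on the first tab-separated field, using the same
# collation (LC_ALL=C) for sort and join so they agree on the order.
LC_ALL=C sort -t "$tab" -k1,1 "$tmp/file1" > "$tmp/file1.sorted"
LC_ALL=C sort -t "$tab" -k1,1 "$tmp/file2" > "$tmp/file2.sorted"

# -v 1 prints only the lines of the first file that have no partner in the second;
# -t "$tab" makes tab the field separator for both input and output.
result=$(LC_ALL=C join -t "$tab" -v 1 "$tmp/file1.sorted" "$tmp/file2.sorted")

printf '%s\n' "$result"
rm -rf "$tmp"
```

The trade-off is that the output follows the sorted order rather than file1's original order.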

karakfa