Nawk - nawk out of space in tostring on (compare two files

Question

I am running nawk scripts on a Solaris system to get the records of file1 that are not in file2 and to find duplicate records in a file, using the following scripts:

compare:

nawk 'NR==FNR{a[$0]++;next;} !a[$0] {print"line":" FNR $0}' file1 file2

duplicate:

nawk '{a[$0]++}END{for(i in a){if(a[i]-1)print i,a[i]}}' file1

In the middle of the script I get this error message:

nawk: out of space in tostring on record 971360

I am using a file having 2 million records.

what is your question? Please don't make us guess ;-) If script 1 is working, then use it. Good luck. — shellter, Feb 17 '14 at 15:50
Can the files be sorted? If so then using `comm` for the compare and `uniq` for identifying duplicates would be the normal approach. Post some sample input and expected output if you'd like help. — Ed Morton, Feb 17 '14 at 16:00
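The sort-based approach suggested in the comment above could look roughly like the sketch below. It assumes the order of records does not matter. The sample data, the temporary directory, and the `.sorted` file names are invented for illustration and are not from the question.

```shell
# Use one fixed collation so sort, comm and uniq agree on the ordering.
LC_ALL=C
export LC_ALL

# Hypothetical sample data in a scratch directory (not the asker's files).
tmp=$(mktemp -d)
printf 'b\na\nb\nc\n' > "$tmp/file1"
printf 'c\nd\n' > "$tmp/file2"

# comm requires both inputs to be sorted.
sort "$tmp/file1" > "$tmp/file1.sorted"
sort "$tmp/file2" > "$tmp/file2.sorted"

# -2 drops lines found only in file2, -3 drops lines common to both,
# so what remains is the set of lines found only in file1.
only_in_file1=$(comm -23 "$tmp/file1.sorted" "$tmp/file2.sorted")

# uniq -d prints one copy of every line that occurs more than once.
dups=$(uniq -d "$tmp/file1.sorted")

echo "only in file1:"
echo "$only_in_file1"
echo "duplicates in file1:"
echo "$dups"

rm -rf "$tmp"
```

Unlike the awk scripts, this does not hold every record in an in-memory array, which may help with files of a few million records.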

Akshay Hegde · Answer 1 · 2014-02-17T16:31:59.823

1

Correct your code; your double quotes are also mismatched.

 nawk 'NR==FNR{a[$0];next;} !($0 in a){print "line:" FNR $0}' file1 file2

--edit--

For duplicates, try this:

nawk '{A[$0]++}END{for(i in A)if(A[i]>1)print i,A[i]}' file

!a[$0] --> referencing a[$0] creates an extra empty array element for every $0 that does not already exist in array a while reading the second file, so the array keeps growing. It is better to test membership with !($0 in a), which does not create anything.
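A small experiment can make the difference visible. This is only a sketch: the sample data and temporary directory are made up, and it calls `awk` through a hypothetical `AWK` variable on the assumption that a POSIX-style awk is installed (on Solaris, set `AWK=nawk`).

```shell
# AWK is a hypothetical override; default to whatever "awk" is on PATH.
AWK=${AWK:-awk}

# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/file1"
printf 'a\nc\nd\n' > "$tmp/file2"

# Lookup with a[$0]: each record of file2 that is missing from a
# gets added to a as a side effect of the reference.
grew=$("$AWK" 'NR==FNR{a[$0];next} !a[$0]{miss++}
  END{n=0; for(k in a) n++; print n}' "$tmp/file1" "$tmp/file2")

# Membership test with ($0 in a): a still holds only the file1 records.
kept=$("$AWK" 'NR==FNR{a[$0];next} !($0 in a){miss++}
  END{n=0; for(k in a) n++; print n}' "$tmp/file1" "$tmp/file2")

echo "a[\$0] lookup: $grew elements; (\$0 in a) test: $kept elements"

rm -rf "$tmp"
```

With this sample data the first count is 4 and the second is 2: the two records that appear only in file2 were added to the array by the plain lookup.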

edited Feb 17 '14 at 16:31

answered Feb 17 '14 at 16:12

Akshay Hegde

