How to compare two csv files in UNIX and create delta ( modified/ new records )

Question

I have two CSV files, old.csv and new.csv. I need only the new or updated records from new.csv: any record in new.csv that also exists in old.csv should be deleted.

old.csv

"R","abc","london","1234567"
"S","def","london","1234567"
"T","kevin","boston","9876"
"U","krish","canada","1234567"

new.csv

"R","abc","london","5678"
"S","def","london","1234567"
"T","kevin","boston","9876"
"V","Bell","tokyo","2222"

Output in new.csv

"R","abc","london","5678"     
"V","Bell","tokyo","2222"

Note: if all records in new.csv are the same as in old.csv, then new.csv should end up empty.

James Brown · Accepted Answer · 2017-03-22T16:46:18.757

You could use grep, for example:

$ grep -v -f old.csv new.csv # > the_new_new.csv 
"R","abc","london","5678"
"V","Bell","tokyo","2222"

and:

$ grep -v -f old.csv old.csv
$                            # see, no differences between 2 identical files

man grep:

  -f FILE, --file=FILE
          Obtain  patterns  from  FILE,  one  per  line.   The  empty file
          contains zero patterns, and therefore matches nothing.   (-f  is
          specified by POSIX.)

  -v, --invert-match
          Invert the sense of matching, to select non-matching lines.  (-v
          is specified by POSIX.)
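
One caveat with the plain -f form: each line of old.csv is used as a regular expression and may match anywhere within a line, so a short old record can suppress a longer new record that merely contains it, and characters such as `.` act as wildcards. A minimal sketch of a stricter variant, assuming exact whole-line comparison is what is wanted, adds -F (fixed strings) and -x (whole-line match), both specified by POSIX. The scratch directory and the sample records are made up for the demonstration:

```shell
# Demo data in a scratch directory (paths and records are illustrative only).
tmp=$(mktemp -d)
printf '%s\n' '"T","kevin","boston","9876"' '"S","def"' > "$tmp/old.csv"
printf '%s\n' '"T","kevin","boston","9876"' '"S","def","london","1"' \
              '"V","Bell","tokyo","2222"' > "$tmp/new.csv"

# Plain -f: the short old record "S","def" matches as a substring,
# so the longer new record starting with it is dropped as well.
loose=$(grep -v -f "$tmp/old.csv" "$tmp/new.csv")

# -F: treat patterns as fixed strings; -x: a pattern must match the whole line.
strict=$(grep -v -F -x -f "$tmp/old.csv" "$tmp/new.csv")

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
rm -rf "$tmp"
```

Here the loose form keeps only the "V" record, while the strict form keeps both the "S" and the "V" records.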

Then again, you could use awk for it:

$ awk 'NR==FNR{a[$0];next} !($0 in a)' old.csv new.csv
"R","abc","london","5678"
"V","Bell","tokyo","2222"

Explained:

awk '
NR==FNR{            # the records in the first file are hashed to memory
    a[$0]
    next
} 
!($0 in a)          # the records which are not found in the hash are printed
' old.csv new.csv   # > the_new_new.csv
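
The question asks for the result to end up in new.csv itself. Redirecting straight to new.csv would truncate the file before awk reads it, so a sketch under that assumption writes to a temporary file first and then replaces the original. The scratch directory and the `.tmp` suffix are assumptions for illustration:

```shell
# Demo data in a scratch directory (paths are illustrative only).
tmp=$(mktemp -d)
printf '%s\n' '"R","abc","london","1234567"' '"S","def","london","1234567"' > "$tmp/old.csv"
printf '%s\n' '"R","abc","london","5678"' '"S","def","london","1234567"' \
              '"V","Bell","tokyo","2222"' > "$tmp/new.csv"

# Write the delta to a temporary file, then replace new.csv only if awk succeeded.
awk 'NR==FNR{a[$0];next} !($0 in a)' "$tmp/old.csv" "$tmp/new.csv" > "$tmp/new.csv.tmp" &&
    mv "$tmp/new.csv.tmp" "$tmp/new.csv"

result=$(cat "$tmp/new.csv")
printf '%s\n' "$result"
rm -rf "$tmp"
```

One edge case to keep in mind: if old.csv is empty, the NR==FNR test is also true while new.csv is being read, so nothing is printed and new.csv would end up empty.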

Your answer without an `awk`? Weird :) Also, I think we should add this as part of the `awk` Wiki FAQ; there is a question of this sort every day of the week! Maybe Ed Morton can have a say in this. - Inian, Mar 22 '17 at 16:40

score 5 · Answer 2 · answered Mar 22 '17 at 23:17

When the files are sorted:

comm -13 old.csv new.csv

When they are not sorted, and sorting is allowed:

comm -13 <(sort old.csv) <(sort new.csv)
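
The `<( )` process substitution is a bash/ksh/zsh feature. In a plain POSIX sh, a sketch of the same idea sorts into temporary files first. The scratch directory, the `.sorted` names, and the LC_ALL=C setting are assumptions for the demonstration:

```shell
export LC_ALL=C   # assumption: byte-order collation, so sort and comm agree
tmp=$(mktemp -d)
printf '%s\n' '"U","krish","canada","1234567"' '"R","abc","london","1234567"' \
              '"S","def","london","1234567"' > "$tmp/old.csv"
printf '%s\n' '"V","Bell","tokyo","2222"' '"S","def","london","1234567"' \
              '"R","abc","london","5678"' > "$tmp/new.csv"

sort "$tmp/old.csv" > "$tmp/old.sorted"
sort "$tmp/new.csv" > "$tmp/new.sorted"

# -1 drops lines found only in the first file, -3 drops lines common to both,
# leaving the lines found only in new.csv.
delta=$(comm -13 "$tmp/old.sorted" "$tmp/new.sorted")
printf '%s\n' "$delta"
rm -rf "$tmp"
```

Note that the delta comes out in sorted order (the "R" record before the "V" record here), not in the order the records had in new.csv.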

Walter A
