I want to compare file2 to file1 by matching on the first 91 characters of each record and output the full record from file2 to file3. I'm new to Unix commands and just can't seem to figure this out.
Thanks in advance, Jeff
You can compare two files using cmp:
$ cmp file1 file2
file1 file2 differ: byte 92, line 1
If you only want to compare the first 91 bytes, you can use the -n switch:
$ cmp -n 91 file1 file2
If you want to do something in that case (e.g., copy the file to another file), you can use bash's if:
if cmp -n 91 file1 file2; then
    cp file2 file3
fi
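Note that cmp -n 91 looks only at the first 91 bytes of each file as a whole, so it tests the first record and not every line. The following is a minimal, self-contained sketch of the if/cmp pattern above. The scratch directory and the names demo1, demo2 and demo3 are made up for illustration, and it assumes a cmp that supports -n (GNU cmp does):

```shell
# Hypothetical scratch files; demo1, demo2 and demo3 are illustrative names only.
tmp=$(mktemp -d)
printf '%091d tail-A\n' 0 > "$tmp/demo1"   # 91 zeros, then a tail that differs
printf '%091d tail-B\n' 0 > "$tmp/demo2"

# -s suppresses the "differ" message so only the exit status is used.
# -n 91 limits the comparison to the first 91 bytes (assumed available, as in GNU cmp).
if cmp -s -n 91 "$tmp/demo1" "$tmp/demo2"; then
    cp "$tmp/demo2" "$tmp/demo3"
    echo "first 91 bytes match; copied"
else
    echo "first 91 bytes differ; nothing copied"
fi
```

Here the two files differ only after byte 91, so the copy happens.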
I generated dummy files as follows:
file1
A012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
B012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
C012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
D012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
E012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
F012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
file2
Z012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 1
B012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 2
T012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 3
D012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 4
E012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 5
F012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 6
Then I think you want this:
awk '
# Processing for file1, basically create associative array entry indexed by leftmost 91 characters
FNR==NR { f1[substr($0,1,91)]++; next }
# Processing for second file
f1[substr($0,1,91)] > 0
' file1 file2
Sample Output
B012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 2
D012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 4
E012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 5
F012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 6
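Since the question asks for the result to land in file3, the same idea can be combined with a redirect. This is only a sketch under a few assumptions: it works in a scratch directory with shortened sample data, passes the key length in as a variable n, and uses the array name seen, none of which come from the original answer:

```shell
# Hypothetical scratch directory so the sketch does not touch real files.
tmp=$(mktemp -d)
pad=$(printf '%090d' 0)   # 90 filler characters; letter + filler = 91-character key

printf '%s\n' "A$pad" "B$pad" "D$pad" > "$tmp/file1"
printf '%s\n' "Z$pad Line 1" "B$pad Line 2" "T$pad Line 3" "D$pad Line 4" > "$tmp/file2"

# n is the key length. Records of file2 whose key also occurs in file1 go to file3.
awk -v n=91 '
  FNR==NR { seen[substr($0,1,n)] = 1; next }   # first file: remember each key
  substr($0,1,n) in seen                       # second file: print when the key is known
' "$tmp/file1" "$tmp/file2" > "$tmp/file3"

cat "$tmp/file3"
```

With this sample data, file3 ends up holding the "Line 2" and "Line 4" records. Using `in` for the lookup avoids creating empty array entries for keys that only occur in file2.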
Actually, I now think you might want precisely the other lines. If so, change this:
f1[substr($0,1,91)] > 0
to this:
! f1[substr($0,1,91)]
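If it is unclear which of the two sets is needed, one pass can write both. The sketch below is an illustration only: the output names matched.txt and unmatched.txt, the awk variables hit, miss and n, and the shortened sample data are all assumptions, not part of the answer above:

```shell
# Hypothetical scratch directory and shortened sample data.
tmp=$(mktemp -d)
pad=$(printf '%090d' 0)   # 90 filler characters; letter + filler = 91-character key

printf '%s\n' "B$pad" "D$pad" > "$tmp/file1"
printf '%s\n' "Z$pad Line 1" "B$pad Line 2" "T$pad Line 3" "D$pad Line 4" > "$tmp/file2"

# Route each file2 record by whether its first n characters occur in file1.
awk -v n=91 -v hit="$tmp/matched.txt" -v miss="$tmp/unmatched.txt" '
  FNR==NR { seen[substr($0,1,n)] = 1; next }               # first file: remember keys
  { print > ((substr($0,1,n) in seen) ? hit : miss) }      # second file: pick the output
' "$tmp/file1" "$tmp/file2"
```

With this data, matched.txt receives the "Line 2" and "Line 4" records and unmatched.txt receives "Line 1" and "Line 3". An output file that receives no records is never created, which is worth checking for before reading it.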