I want to compare file2 to file1 by matching on the first 91 characters of each file and output the full record from file2 to file3. I'm new to Unix commands and just can't seem to figure this out.

Thanks in advance, Jeff

  • You should show us some code to demonstrate that you have tried to solve the problem yourself. In its current form the question violates point 4 of the rules: http://stackoverflow.com/help/on-topic – Michas Oct 28 '16 at 21:06
  • Sorry for the rule violation. The code I inherited was: comm file1 file2>file3 – jsouthworth Oct 28 '16 at 22:09
  • 1. Edit question. 2. Show code. 3. Add input data. 4. Show expected output. 5. Show received output. – Michas Oct 28 '16 at 22:26
  • Please add sample input and your desired output for that sample input to your question. – Cyrus Oct 29 '16 at 05:56

2 Answers

You can compare two files using cmp:

$ cmp file1 file2
file1 file2 differ: byte 92, line 1

If you want to only compare the first 91 bytes you can use the -n switch:

$ cmp -n 91 file1 file2

If you want to do something in that case (e.g., copy the file to another file), you can use bash's if:

if cmp -n 91 file1 file2; then
    cp file2 file3
fi
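
A self-contained sketch of the same check, assuming a cmp that supports -n (GNU cmp does; the option is not part of POSIX). The scratch directory, the file contents and the result variable are made up for illustration, and -s is added so that cmp stays silent and only its exit status is used:

```shell
#!/bin/sh
# Hypothetical sample data: both files start with the same 91 bytes
# (91 zeros) and differ only after that point.
dir=$(mktemp -d)
printf '%091d tail-A\n' 0 > "$dir/file1"
printf '%091d tail-B\n' 0 > "$dir/file2"

# -s: print nothing, just set the exit status.
# -n 91: compare at most the first 91 bytes (assumes GNU cmp or compatible).
if cmp -s -n 91 "$dir/file1" "$dir/file2"; then
    cp "$dir/file2" "$dir/file3"
    result=copied
else
    result=skipped
fi
echo "$result"
```

Note that cmp treats each file as one stream of bytes, so this only tests the first 91 bytes of the whole file, not the first 91 characters of every line.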
ynimous

I generated dummy files as follows:

file1

A012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
B012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
C012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
D012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
E012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
F012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789

file2

Z012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 1
B012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 2
T012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 3
D012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 4
E012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 5
F012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 6

Then I think you want this:

awk '
   # Processing for file1, basically create associative array entry indexed by leftmost 91 characters
   FNR==NR { f1[substr($0,1,91)]++; next }

   # Processing for second file
   f1[substr($0,1,91)] > 0

   ' file1 file2

Sample Output

B012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 2
D012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 4
E012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 5
F012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 Line 6
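
To get the matching records into file3, as the question asks, the output can simply be redirected. Here is a small sketch of that, using made-up 3-character keys instead of 91 so the data fits on screen; the scratch directory, the variable n and the array name seen are assumptions for illustration, and the in test is a variation on the counter used above:

```shell
#!/bin/sh
# Hypothetical sample data with short keys.
dir=$(mktemp -d)
printf '%s\n' 'AAA' 'BBB' 'CCC' > "$dir/file1"
printf '%s\n' 'ZZZ Line 1' 'BBB Line 2' 'CCC Line 3' > "$dir/file2"

# n is the key length; set n=91 for the real files.
awk -v n=3 '
   # First file: remember every key.
   FNR==NR { seen[substr($0,1,n)]; next }

   # Second file: print the whole record if its key was seen.
   substr($0,1,n) in seen
' "$dir/file1" "$dir/file2" > "$dir/file3"

cat "$dir/file3"
```

With this sample data, file3 ends up holding the "BBB Line 2" and "CCC Line 3" records.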

Actually, I now think you might want precisely the other lines (those in file2 whose first 91 characters do not appear in file1). If so, change this:

f1[substr($0,1,91)] > 0

to this:

! f1[substr($0,1,91)]
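
If it is not yet clear which of the two sets is wanted, one option is to write both in a single pass. This is only a sketch under assumptions: the output names matched and unmatched, the variables n, hit and miss, and the 3-character sample keys are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data with short keys.
dir=$(mktemp -d)
printf '%s\n' 'AAA' 'BBB' 'CCC' > "$dir/file1"
printf '%s\n' 'ZZZ Line 1' 'BBB Line 2' 'CCC Line 3' > "$dir/file2"

# n is the key length (91 for the real files); hit and miss are output paths.
awk -v n=3 -v hit="$dir/matched" -v miss="$dir/unmatched" '
   # First file: remember every key.
   FNR==NR { seen[substr($0,1,n)]; next }

   # Second file: route each record by whether its key was seen.
   { print > ((substr($0,1,n) in seen) ? hit : miss) }
' "$dir/file1" "$dir/file2"
```

Here matched receives the records whose key occurs in file1 and unmatched receives the rest.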
Mark Setchell