
I have two files, filea and fileb, which I don't want to sort (so I can't use comm).

filea    fileb
cat      cat
dog      elephant
cat      snake
rabbit   pony

If the content of filea is the same as that of fileb, display what is in fileb. If the contents differ and fileb contains elephant, display ele; if it contains snake, display sna; if pony, display pon.

I tried using cmp:

if cmp -s filea fileb
then echo $"fileb"
fi

but it didn't display anything. I want the output to be in a column in a third file.

Jonathan Leffler
t28292

3 Answers


You seem to want to print fileb if it's the same as filea. If those are different, you want to print the first 3 characters of the lines that are not present in filea. The following should work for you:

$ cmp -s filea fileb && cat fileb || { grep -v -f filea fileb | cut -c-3; }
ele
sna
pon

(The paraphrased question above is, indeed, the explanation for the expression above.)
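One caveat: `grep -f filea` treats each line of filea as a regular expression that may match anywhere in a line, so a pattern such as `cat` would also exclude a line like `caterpillar`. A minimal sketch of a stricter variant, assuming exact whole-line comparison is what is wanted, adds `-F` (fixed strings) and `-x` (whole-line match). The scratch directory and the output name filec are only there to make the example self-contained:

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' cat dog cat rabbit > filea
printf '%s\n' cat elephant snake pony > fileb

# -F: fixed strings, -x: whole-line match, -v: invert, -f: read patterns from a file.
if cmp -s filea fileb
then
    cat fileb
else
    grep -Fxv -f filea fileb | cut -c-3
fi > filec

cat filec
```

Writing it as if/else rather than `&& ... ||` also avoids running the grep branch when the files are identical but `cat` fails for some reason.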

devnull

Using awk without sorting either file:

$ awk 'FNR==NR{a[$0];next}!($0 in a)' filea fileb
elephant
snake
pony

Print just the first 3 characters of the differences:

$ awk 'FNR==NR{a[$0];next}!($0 in a){print substr($0,1,3)}' filea fileb
ele
sna
pon

For the output to be in a new file, use redirection:

$ awk 'FNR==NR{a[$0];next}!($0 in a){print substr($0,1,3)}' filea fileb > filec

EDIT:

FNR==NR       # Are we looking at the first file
a[$0]         # If so build an associative array of the file
next          # Go get the next line in the file
!($0 in a)    # In the second file now, check if the current line is in the array
print sub...  # If not print the first 3 characters from the current line
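
The awk command on its own does not cover the case where the two files are identical. A rough sketch of how it could be combined with cmp, assuming the sample data from the question (the scratch directory is only for illustration):

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' cat dog cat rabbit > "$tmp/filea"
printf '%s\n' cat elephant snake pony > "$tmp/fileb"

if cmp -s "$tmp/filea" "$tmp/fileb"
then
    # Identical files: show fileb unchanged.
    cat "$tmp/fileb"
else
    # Different files: first 3 characters of fileb lines that are absent from filea.
    awk 'FNR==NR{a[$0];next}!($0 in a){print substr($0,1,3)}' "$tmp/filea" "$tmp/fileb"
fi > "$tmp/filec"

cat "$tmp/filec"
```

Note that the FNR==NR test assumes filea is not empty; with an empty first file, the lines of fileb would be treated as the first file and nothing would be printed.
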
Chris Seymour

As far as I recall, cmp returns true (exit status 0) if the files are the same. It is therefore not surprising that the if statement printed nothing: the files are different. You need an else clause that finds the three words in fileb and truncates each to its first three characters:

if cmp -s filea fileb
then cat fileb
else
    {
    grep elephant fileb
    grep snake fileb
    grep pony fileb
    } |
    sed 's/\(...\).*/\1/'
fi
Jonathan Leffler
  • The `sed` command remembers 3 characters (the first 3) and also matches any extra characters on the line, but replaces the whole lot with just the remembered ones. – Jonathan Leffler Jul 30 '13 at 20:18
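
To see the substitution in isolation, here is a small sketch that feeds a few words through the same sed expression. The word `ox` is not in the question; it is added here only to show what happens to a line shorter than three characters:

```shell
# \(...\) captures the first three characters, .* consumes the rest,
# and \1 replaces the whole line with the captured part.
short=$(printf '%s\n' elephant snake pony ox | sed 's/\(...\).*/\1/')
printf '%s\n' "$short"
```

The first three words come out as ele, sna and pon. The line `ox` does not match the pattern at all, so sed prints it unchanged.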