I have two files, f1.txt and f2.txt. I want to remove rows from File 1 (f1.txt) if the first column has a matching entry in File 2 (f2.txt). f2.txt has only one column per line, whereas each row of f1.txt has two or more columns. Here is an example:

cat f1.txt

1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 1000
2, 100, 200, 300, 400
3, 100, 2000, 3000
4, 400, 500 
5, 500, 600, 700, 800, 900, 1000

cat f2.txt

2
4

Here is the desired output:

1, 10, 20, 30, 40
3, 100, 2000, 3000, 400
5, 500, 600, 700, 800
Jotne
user3803714
  • 3
    Read the column from f2.txt into a set, then for each line in f1.txt, split out the first column and see if it's in the set. We don't write your code, just suggest how to improve it. – tdelaney Dec 18 '14 at 01:40
  • 1
    Where did the 6th and subsequent fields from lines 1 and 5 go? Where did the `400` at the end of line 3 come from? Put just a TINY bit of effort into asking the question. – Ed Morton Dec 18 '14 at 05:03
  • 1
    Try this: `awk 'FNR==NR{a[$1];next} {p=1;c=+$1;for (i in a) if(c==i) p=0} p' f2.txt f1.txt` – Jotne Dec 18 '14 at 07:29

1 Answer

Modify the pattern file f2.txt like so:

sed -i -e 's/^/\^/;s/$/\\b/' f2.txt

f2.txt will look like

^2\b
^4\b
etc.

Then compare the files with grep:

grep -vf f2.txt f1.txt
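
As a worked illustration, the two steps above can be tried against the sample data from the question. This is only a sketch under some assumptions: it relies on `\b` (a word-boundary extension supported by GNU grep, not part of POSIX), and it writes the anchored patterns to a separate file, `patterns.txt` (a hypothetical name, not from the original answer), so that f2.txt is not modified in place.

```shell
# Run in an empty scratch directory: this overwrites f1.txt and f2.txt.
# Recreate the sample input from the question.
printf '%s\n' \
  '1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 1000' \
  '2, 100, 200, 300, 400' \
  '3, 100, 2000, 3000' \
  '4, 400, 500' \
  '5, 500, 600, 700, 800, 900, 1000' > f1.txt
printf '%s\n' 2 4 > f2.txt

# Turn each key into an anchored pattern such as ^2\b.
# patterns.txt is a hypothetical file name; f2.txt stays unchanged.
sed -e 's/^/^/' -e 's/$/\\b/' f2.txt > patterns.txt

# Print only the lines of f1.txt that match none of the patterns.
grep -vf patterns.txt f1.txt
```

With the sample data this prints the rows whose first column is 1, 3, or 5. Because of the `\b`, a key of `1` would not match a row starting with `10`.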
buydadip
  • 1
    Any time you write a shell loop to manipulate text you have the wrong approach. Also - what do you think will happen given your solution if f2.txt has a line with a 1 in it and f1.txt has a line that starts with, say, 10? – Ed Morton Dec 18 '14 at 05:05
  • 1
    @Ed Morton You're right, I forgot the comma, assuming that the numbers are separated by commas. What would you suggest as a better approach, using awk? – buydadip Dec 18 '14 at 05:11
  • 1
    yes, awk is the tool that the guys who invented shell invented for shell to call to manipulate text. The whole script can be done concisely, efficiently, and robustly as just `awk -F, 'NR==FNR{a[$0];next} !($1 in a)' f2.txt f1.txt` – Ed Morton Dec 18 '14 at 05:52
  • 1
    @EdMorton check my edit, do you approve? It works fine if f2.txt is a single column, as OP stated. – buydadip Dec 18 '14 at 07:35
  • I initially said "Looks good to me." but now I realise it'll match on any field, not just the first field, so it won't work. – Ed Morton Dec 18 '14 at 14:15
  • @EdMorton OK, I'm sure my new solution works... it's the best I can do without copying the awk solution you provided. – buydadip Dec 18 '14 at 17:36
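
For comparison, the awk one-liner quoted in the comments above can be run against the question's sample data. This is a minimal sketch, assuming f2.txt is non-empty and its lines carry no stray whitespace (the key is the whole line, compared exactly against the first comma-separated field of f1.txt).

```shell
# Run in an empty scratch directory: this overwrites f1.txt and f2.txt.
# Recreate the sample input from the question.
printf '%s\n' \
  '1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 1000' \
  '2, 100, 200, 300, 400' \
  '3, 100, 2000, 3000' \
  '4, 400, 500' \
  '5, 500, 600, 700, 800, 900, 1000' > f1.txt
printf '%s\n' 2 4 > f2.txt

# While reading the first file (f2.txt), NR==FNR holds, so each whole line
# is stored as a key of the array a and the rest of the script is skipped.
# While reading the second file (f1.txt), a line is printed only if its
# first comma-separated field is not one of the stored keys.
awk -F, 'NR==FNR { a[$0]; next } !($1 in a)' f2.txt f1.txt
# Output:
# 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 1000
# 3, 100, 2000, 3000
# 5, 500, 600, 700, 800, 900, 1000
```

Since the comparison is an exact match on the first field only, a key of `1` does not remove a row starting with `10`, and values in later columns are never considered.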