
I'm trying to figure out how to store a list as a variable (array?) and use it with awk.

I have a file like this:

Jimmy
May31
John
June19
Paul
Aug15
Mark
Sept1
David
Nov15

I want to use awk to search my file and remove three names and the line following each of those names. So the final file should only contain 2 names (and birthdays).

I can do this with:

awk '/Jimmy|Mark|David/{n=2}; n {n--; next}; 1' < file

But is there a way to store the "Jimmy|Mark|David" list in the above command as a variable/array and do the same thing? (The real project I'm working on has a much longer list to match in a much bigger file.)

Thanks!

user2414840
  • You can pass variables into awk using `-v`, if that will achieve your goal. See [The GNU Awk User's Guide](http://www.delorie.com/gnu/docs/gawk/gawk_165.html) for some info on how that and other very useful Awk flags work – Davy M Aug 09 '17 at 22:43
  • I haven't been able to figure out the -v and ~ syntax except for a very simple example – user2414840 Aug 09 '17 at 22:45

3 Answers

2

You can do it with the -v/--assign option:

awk -v pat='Jimmy|Mark|David' '$0~pat {n=2}; n {n--; next}; 1' birthdays

and then apply the regex match explicitly with the ~ operator against the complete line ($0).
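As a rough sketch of the same idea with the list held in a shell variable first: the variable name `names` and the temporary directory are only illustrative, and the `^(...)$` anchoring is an addition of mine (not part of the command above) so that a name like Mark would not also match a longer line such as Markus.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d)

# Sample data from the question.
printf '%s\n' Jimmy May31 John June19 Paul Aug15 Mark Sept1 David Nov15 > "$tmp/birthdays"

# Hypothetical shell variable holding the alternation, anchored to the whole line.
names='^(Jimmy|Mark|David)$'

# -v copies the shell value into the awk variable pat;
# $0 ~ pat treats the contents of pat as a dynamic regex.
result=$(awk -v pat="$names" '$0 ~ pat {n=2}; n {n--; next}; 1' "$tmp/birthdays")
printf '%s\n' "$result"
```

Quoting "$names" keeps the shell from splitting or globbing the value before awk sees it.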

Alternatively, if you have a long list of names to filter out stored in a file, grep with -f would probably be a much faster option (see here). For example:

$ cat names
Jimmy
Mark
David

$ paste - - <birthdays | grep -vFf names | tr '\t' '\n'
John
June19
Paul
Aug15
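
If you would rather stay in awk and want an actual array, one possible sketch is the two-file idiom below. The array name `skip` and the scratch directory are my own choices; the names file is the same one as in the grep example. It compares whole lines exactly, so no regex is involved.

```shell
tmp=$(mktemp -d)
printf '%s\n' Jimmy Mark David > "$tmp/names"
printf '%s\n' Jimmy May31 John June19 Paul Aug15 Mark Sept1 David Nov15 > "$tmp/birthdays"

# While reading the first file (NR==FNR), store each name as an array key.
# While reading the second file, a line that is a stored key starts a
# two-line skip (the name itself and the line after it).
result=$(awk 'NR==FNR {skip[$0]; next}
              ($0 in skip) {n=2}
              n {n--; next}
              1' "$tmp/names" "$tmp/birthdays")
printf '%s\n' "$result"
```

Note that NR==FNR assumes the names file is not empty; with an empty first file the condition would also hold for the second one.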
randomir
0

You can get the list in a variable like this:

LIST=$(cat list.txt | tr "\n" "|")

and then use @randomir's answer:

awk -v pat="$LIST" '$0~pat {n=2}; n {n--; next}; 1' birthdays

if I put your list:

Jimmy
John
Paul
Mark
David

into the file list.txt

LIST=$(cat list.txt | tr "\n" "|")

will set LIST to

Jimmy|John|Paul|Mark|David

provided there is no linebreak at the end of the last line. Otherwise tr leaves a trailing | in the pattern, and that empty alternative can match every line.
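
If list.txt does end with a newline (most editors add one), a variant that should avoid the trailing | is paste in serial mode. This is only a sketch: it assumes list.txt holds just the three names to remove, and the scratch directory is illustrative.

```shell
tmp=$(mktemp -d)
printf '%s\n' Jimmy Mark David > "$tmp/list.txt"   # note: ends with a newline
printf '%s\n' Jimmy May31 John June19 Paul Aug15 Mark Sept1 David Nov15 > "$tmp/birthdays"

# paste -s joins all lines of the file into one, using | as the separator,
# so no separator is left dangling after the last name.
LIST=$(paste -s -d '|' "$tmp/list.txt")
printf '%s\n' "$LIST"

result=$(awk -v pat="$LIST" '$0~pat {n=2}; n {n--; next}; 1' "$tmp/birthdays")
printf '%s\n' "$result"
```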

MrE
  • MrE can you spell out the full list? I am not reading it from a file and I can't seem to adapt your example – user2414840 Aug 09 '17 at 22:50
  • list.txt is the list you provided – MrE Aug 09 '17 at 22:50
  • the list I provided is the file to be searched for matches "Jimmy, Mark and David"...so the variable is Jimmy, mark, david – user2414840 Aug 09 '17 at 22:51
  • but it seems to me, the way you are looking at this, you want to just remove one \n every other line, and then you'll have `Name Birthday` on the same line. Then use awk to read whatever you want for each line. – MrE Aug 09 '17 at 22:54
  • Thanks. If I modify the list to include only the three names, it does what I want it to – user2414840 Aug 09 '17 at 23:01
  • @user2414840, if you have a list of names in a file, have a look at my (updated) answer for an alternative (faster/simpler) solution. It uses only `grep`, and it should be faster than `awk`, particularly for long lists of names. – randomir Aug 11 '17 at 18:41
0

Seems like it would be easier to do this:

Paste each pair of lines together with cat file | paste - -

then use awk to do what you need to do

$ cat list.txt | paste - -
Jimmy   May31
John    June19
Paul    Aug15
Mark    Sept1
David   Nov15
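
Once each name and date share a line, the filtering step could look something like the sketch below. The file name birthdays, the anchored pattern, and the final tr (to restore the two-line layout) are assumptions on my part.

```shell
tmp=$(mktemp -d)
printf '%s\n' Jimmy May31 John June19 Paul Aug15 Mark Sept1 David Nov15 > "$tmp/birthdays"

# paste - - joins every two lines with a tab.
# awk keeps only the records whose first (tab-separated) field is not a listed name.
# tr turns the tab back into a newline to get the original layout.
result=$(paste - - < "$tmp/birthdays" |
         awk -F'\t' -v pat='^(Jimmy|Mark|David)$' '$1 !~ pat' |
         tr '\t' '\n')
printf '%s\n' "$result"
```

With name and date on one record, there is no need for the n counter at all.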
MrE
  • you can do the same in awk by the way, but it's more obscure: `awk 'ORS=NR%2?" ":"\n"' ` – MrE Aug 09 '17 at 23:01