I'd like to delete every line that matches either of these patterns:

  1. CE1(2or8 # CE1( followed by the number 2 or 8
  2. CE2(-1-17-2or8 # CE2( followed by any number from -1 to 17, a dash, then the number 2 or 8

together with the 6 lines before it and the 1 line after it.

grep -B6 -A1 'CE1([28]\|CE2([-1-17]-[28]' file

This attempt seems to match my pattern (does it do what I explicitly described?), but I was thinking of using the invert option (-v) to remove the matched lines and their context from my file. Is that possible? It does not seem to work.

2 Answers

Not a complete answer, but some explanations:

A character class matches exactly one character. Inside a class, a hyphen is literal when it is the first character, the last character, or comes immediately after ^ (and, in regex flavors that allow escaping inside a class, when it is escaped). Anywhere else it defines a range of characters, not a range of numbers. (Try a few ranges with an ASCII table at hand to see how this works.)

[-1-17] therefore matches a single character, which can be:

  • a literal hyphen (because it comes first)
  • a character in the range 1-1 (so 1)
  • the character 7
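
A quick way to see this for yourself is a minimal sketch like the one below. The candidate strings are made up for illustration, and grep's -x option is used so that the whole line has to match the class:

```shell
# Feed a few candidate strings, one per line, and keep only those that
# consist of exactly one character from the class [-1-17].
matched=$(printf '%s\n' - 0 1 2 7 9 17 | grep -x '[-1-17]' | paste -s -d ' ' -)
echo "$matched"    # prints: - 1 7
```

Note that 17 is rejected: it is two characters long, and the class only ever consumes one.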

To match an integer between -1 and 17, you need:

\(-1\|1[0-7]\|[0-9]\)
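
As a sanity check, the alternation can be run against a range of integers. This is only a sketch: it assumes a grep that supports \| in basic regular expressions (GNU grep does), and it adds -x so that each number must match as a whole line:

```shell
# Generate the integers -3..20 and keep those the alternation accepts.
accepted=$(awk 'BEGIN { for (i = -3; i <= 20; i++) print i }' |
    grep -x -e '\(-1\|1[0-7]\|[0-9]\)' | paste -s -d ' ' -)
echo "$accepted"   # prints: -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
```

Without -x (or surrounding context in a larger pattern) the expression would also match inside longer numbers such as 18, because [0-9] alone is satisfied by the 1.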
Casimir et Hippolyte

IMHO the simplest and most robust approach is 2 passes over the file: the first identifies the lines to be skipped and the second prints everything else. It is robust because it still works when the skipped range itself includes lines that match the regexp, and when the range runs off the start or end of the input file:

$ cat file
a 1
b 2
c 3
d 4
e 5
f 6
g 7
h 8
i 9

$ awk -v b=3 -v a=1 'NR==FNR{if (/f/) for (i=NR-b;i<=NR+a;i++) skip[i]; next} !(FNR in skip)' file file
a 1
b 2
h 8
i 9

Just change /f/ to /<your regexp of choice>/ and set the b(efore) and a(fter) values as you like.
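
To illustrate the two edge cases mentioned earlier, here is a sketch that wraps the same two-pass command in a shell function. The name skip_context and the temporary file are hypothetical, introduced only for this example. The regexp is passed with -v, so any backslashes in it would need doubling:

```shell
# skip_context REGEXP BEFORE AFTER FILE   (hypothetical helper name)
skip_context() {
    awk -v re="$1" -v b="$2" -v a="$3" '
        # Pass 1: remember the line numbers around every match.
        NR==FNR { if ($0 ~ re) for (i=NR-b; i<=NR+a; i++) skip[i]; next }
        # Pass 2: print only the lines that were not marked.
        !(FNR in skip)
    ' "$4" "$4"
}

tmp=$(mktemp)
printf '%s\n' 'a 1' 'b 2' 'c 3' 'd 4' 'e 5' 'f 6' 'g 7' 'h 8' 'i 9' > "$tmp"

# The match is on line 1, so the 3 lines "before" it do not exist.
start_edge=$(skip_context '^a' 3 1 "$tmp" | paste -s -d , -)
# Lines 5 and 6 both match, so their skipped ranges overlap.
overlap=$(skip_context '^[ef]' 1 1 "$tmp" | paste -s -d , -)

echo "$start_edge"   # prints: c 3,d 4,e 5,f 6,g 7,h 8,i 9
echo "$overlap"      # prints: a 1,b 2,c 3,h 8,i 9
rm -f "$tmp"
```

Line numbers outside the file simply end up as unused keys in skip, which is why no special handling is needed at either end.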

As for your particular regexp, you didn't provide any sample input and expected output for us to test against but I THINK what you want might be:

awk -v b=6 -v a=1 'NR==FNR{if (/CE(1\(|2\((-1|[0-9]|1[0-7])-)[28]/) for (i=NR-b;i<=NR+a;i++) skip[i]; next} !(FNR in skip)' file file
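
In the absence of real sample input, here is a rough check of the regexp on its own. It assumes that the ( after CE1 and CE2 is a literal parenthesis, which awk's extended regexps spell \(. Only CE1(2-14;7-14) and CE2(7-8;14-14) come from the comments below; the other strings are made up as near misses:

```shell
# Keep only the strings that the regexp accepts.
hits=$(printf '%s\n' \
    'CE1(2-14;7-14)' \
    'CE1(8-3;1-1)' \
    'CE1(3-14;7-14)' \
    'CE2(7-8;14-14)' \
    'CE2(-1-2;0-0)' \
    'CE2(17-8;0-0)' \
    'CE2(18-2;0-0)' \
    'CE2(7-9;14-14)' |
    awk '/CE(1\(|2\((-1|[0-9]|1[0-7])-)[28]/' | paste -s -d ' ' -)
echo "$hits"
```

Be aware that [28] only checks one digit, so a value such as 25 or 80 in that position would match as well. If that matters, the character after the digit has to be constrained too.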
Ed Morton
  • The sample is, e.g., CE1(2-14;7-14) or CE2(7-8;14-14). In CE1, the first number in the parentheses has to be either 2 or 8. In the case of CE2, the second number (the one after the dash) has to be either 2 or 8, and the first can take any value from -1 to 17 (in increments of 1). – almighty_dollar Feb 28 '16 at 18:39
  • No, edit your question to include a sample of the input file you would like to run the tool against and the output you want to get given that input file. The use cases you described in your comment look quite different from those described in your question and don't make sense (if the first number in `CE1(2-14;7-14)` must be 2 or 8 then what do you mean in general by `CE1(2-14;7-14)` vs `CE1[28]....`?) so you might want to update that too. – Ed Morton Feb 28 '16 at 19:53