I'm trying to parse a standard diff of some SQL files and return only the delete sections. I have been using grep with after-context (-A), which almost works (only because I know that the delete sections will all be very short), e.g.:

diff $$_$1.sql $$_$2.sql|egrep -A3 "[01234567889][01234567889]d[01234567889][0123456789]"

I am thinking that with AWK, I could tell it to start at (the above regex) and stop at the first line starting with a digit or the first line ending with a --.

I have played around a bit, but can't seem to find the right syntax to do this. Can this be done with AWK, or is there another tool I should use?

user9517
Robert
    Preferably an example of the `diff` output (or at least tell us what KIND of diff it is -- edit script, context diff, unified diff, etc.) – voretaq7 Aug 11 '11 at 16:31
    In addition to @voretaq7's questions, it'd also be worth knowing if you need the result to be a valid patch file afterwards. – womble Aug 11 '11 at 18:08

3 Answers

I am thinking that with AWK, I could tell it to start at (the above regex) and stop at the first line starting with a digit or the first line ending with a --.

If this is not what you want, please give us an example:

sed -n '/[0-9][0-9]d[0-9][0-9]/,/^[0-9]\|--$/p'

EDIT

Although you have already accepted my answer, I want to edit my post to share a command that might solve your problem more thoroughly. sed lets you skip the matching lines with the b (branch) command:

sed -n '/[0-9][0-9]d[0-9][0-9]/,/^[0-9]\|--$/ { /^[0-9]/b; p }'

But with this command, sed also drops the line that matched REGEX1 (the start of the range). So a negative lookahead came to mind:

sed -n '/[0-9][0-9]d[0-9][0-9]/,/^[0-9]\|--$/ { /^[0-9](?:(?![0-9]d[0-9][0-9]).*)$/b; p }'

But this does not work, because sed, awk, and grep use the POSIX regex flavor, which doesn't support negative lookahead. You should try Python, Perl, Ruby, ...
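
For what it's worth, the lookahead can probably be avoided by classifying each hunk header once instead of matching a start/end range. Below is a minimal awk sketch, assuming the input is diff's default ("normal") output, where every hunk header starts with a digit and a delete header looks like 12d11 or 15,16d13. The sample data, the function names, and the stricter anchored regex are my own illustrative assumptions, not something taken from the question.

```shell
#!/bin/sh
# Made-up "normal" diff output, used only for illustration.
sample_diff() {
cat <<'EOF'
3c3
< old line
---
> new line
12d11
< DROP TABLE tmp_a;
15,16d13
< DELETE FROM t WHERE id = 1;
< DELETE FROM t WHERE id = 2;
20a19
> INSERT INTO t VALUES (3);
EOF
}

# Hypothetical helper: every hunk header starts with a digit, so decide
# there whether the hunk is a delete and remember the answer in "keep".
# Body lines ("< ...", "---", "> ...") never start with a digit, so they
# simply inherit the decision made at their header.
only_deletes() {
    awk '/^[0-9]/ { keep = ($0 ~ /^[0-9]+(,[0-9]+)?d[0-9]+$/) } keep'
}

sample_diff | only_deletes
```

With the sample above this should print the two delete hunks, headers included, and skip the change and add hunks.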

quanta

I'd be inclined to try to do this with unified diff and a simple grep:

diff -u a.sql b.sql | grep -v '^\+' | rediff

The rediff is going to try and fix up the offsets after you've mangled the diffs... it won't work in all circumstances, but it's the best hope you've got of keeping a valid diff.

womble
diff ... | awk '/start-mark/ {flag = 1} /end-mark/ {flag = 0} flag'

Your regex could probably be simplified by writing each character class as [0-9].

The flag = 0 could be changed to exit if you only want to print the first matching range of lines.
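
As a rough sketch of how the placeholders might be filled in with the patterns from the question: note that a delete header such as 12d11 matches both the start pattern and the "line starting with a digit" end pattern, so in this version the end rule is placed before the start rule. The sample diff and the function names are made up for illustration.

```shell
#!/bin/sh
# Made-up "normal" diff output, used only to exercise the one-liners below.
sample_diff() {
cat <<'EOF'
10c10
< old line
---
> new line
12d11
< DROP TABLE tmp_a;
25,26d23
< DELETE FROM t WHERE id = 1;
< DELETE FROM t WHERE id = 2;
EOF
}

# End rule first, start rule second: a header such as "12d11" matches both
# patterns, so the start rule has to run last for the flag to stay set.
all_delete_ranges() {
    awk '/^[0-9]|--$/ { flag = 0 }
         /[0-9][0-9]d[0-9][0-9]/ { flag = 1 }
         flag'
}

# Variant with exit: stop as soon as the first delete range has ended.
first_delete_range() {
    awk 'flag && /^[0-9]|--$/ { exit }
         /[0-9][0-9]d[0-9][0-9]/ { flag = 1 }
         flag'
}

sample_diff | all_delete_ranges
```

Like the original egrep, the start pattern here only matches headers with at least two digits on each side of the d, so something like 5d4 would be missed unless the pattern is loosened.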

Dennis Williamson