Grep a Log file for the last occurrence of a string between two strings

Question

I have a log file trace.log. In it I need to grep for the content contained between the strings <tag> and </tag>. There are multiple sets of this pair of strings, and I just need to return the content between the last set (in other words, the one closest to the tail of the log file).

Extra Credit: Any way I can return the content contained within the two strings only if the content contains "testString"?

Thanks for looking.

EDIT: The search parameters <tag> and </tag> are on different lines, with about 100 lines of content separating them. The content is what I'm after...

Examples of input might help; it's not clear whether the tags are on same line or on different ones. — devnull, Oct 30 '13 at 12:02
the tags are on different lines ..and we're looking at about 70-100 lines of content within the tags. — rs79, Oct 30 '13 at 12:12
Rather than putting this information in the comments, update your question. Apparently, the responses that you've received assume that the tags are on the same line. — devnull, Oct 30 '13 at 12:18

fedorqui · Accepted Answer · 2013-10-30T14:02:47.513

Use tac to print the file in reverse order and then grep -m1 to stop after the first matching line. The lookbehind and lookahead restrict the match to the text between <tag> and </tag> (the -P option needs GNU grep).

tac a | grep -m1 -oP '(?<=tag>).*(?=</tag>)'

Test

Given this file

$ cat a
<tag> and </tag>
aaa <tag> and <b> other things </tag>
adsaad <tag>and  last one</tag>

$ tac a | grep -m1 -oP '(?<=tag>).*(?=</tag>)'
and  last one

Update

EDIT: The search parameters <tag> and </tag> are contained on different lines with about 100 lines of content separating them. The content is what I'm after...

Then it is a bit more tricky:

tac file | awk '/<\/tag>/ {p=1; split($0, a, "</tag>"); $0=a[1]};
                /<tag>/   {p=0; split($0, a, "<tag>");  $0=a[2]; print; exit};
                p' | tac

The idea is to reverse the file and use a flag p to track whether we are inside the block. Printing starts when </tag> appears and finishes when <tag> comes (because we are reading the file backwards).

split($0, a, "</tag>"); $0=a[1]; gets the data before </tag>
split($0, a, "<tag>" ); $0=a[2]; gets the data after <tag>

Test

Given a file a like this:

<tag> and </tag>
aaa <tag> and <b> other thing
come here
and here </tag>

some text<tag>tag is starting here
blabla
and ends here</tag>

The output will be:

$ tac a | awk '/<\/tag>/ {p=1; split($0, a, "</tag>"); $0=a[1]}; /<tag>/ {p=0; split($0, a, "<tag>"); $0=a[2]; print; exit}; p' | tac
tag is starting here
blabla
and ends here
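
For the extra credit in the question, one option is to pipe the result of the command above through a small filter that only passes its input along when some line contains the search string. A minimal sketch, assuming a POSIX awk; print_if_contains is a hypothetical helper name, not part of the original answer:

```shell
# print_if_contains NEEDLE (hypothetical helper)
# Copies stdin to stdout only if at least one input line contains NEEDLE.
# The test is a case-sensitive substring match, not a regex.
# Note: awk -v interprets backslash escapes in the value it is given.
print_if_contains() {
    awk -v needle="$1" '
        { buf = buf $0 "\n" }              # remember every line
        index($0, needle) { hit = 1 }      # substring found on this line
        END { if (hit) printf "%s", buf }  # emit the input only on a hit
    '
}
```

Appending `| print_if_contains testString` to the pipeline prints the block when it contains testString and prints nothing otherwise.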

SlackGadget · Answer 2 · score 26 · answered Nov 12 '14 at 11:03

If, like me, you don't have access to tac because your sysadmin won't play ball, you can try:

grep pattern file | tail -1

Thanks, this should be right answer. because tac will mess with linenumbers in output, tail -1 did the trick – Bruno Rocha - rochacbruno Sep 21 '16 at 00:54
also- tac- not standard on all systems – koolunix Oct 07 '16 at 17:42
Note, however, that this does not take into account the _between two strings_ part. – fedorqui Nov 29 '16 at 07:39
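
Picking up on the last comment: a tac-free approach that does respect the two delimiters can be sketched with a single awk pass that remembers the most recently completed block. This is a sketch under assumptions, not something from the answers here: last_block is a hypothetical helper name, the delimiters are the literal strings <tag> and </tag>, and pairs are not nested.

```shell
# last_block FILE (hypothetical helper)
# Prints the content of the last complete <tag>...</tag> pair in FILE,
# without the tags. Works whether the pair sits on one line or spans many.
last_block() {
    awk '
        {
            line = $0
            while (1) {
                if (!inside) {
                    i = index(line, "<tag>")
                    if (i == 0) break              # no opening tag left on this line
                    line = substr(line, i + 5)     # drop text up to and including <tag>
                    inside = 1; buf = ""; first = 1
                }
                j = index(line, "</tag>")
                if (j == 0) {                      # block continues on the next line
                    buf = buf (first ? "" : "\n") line
                    first = 0
                    break
                }
                buf = buf (first ? "" : "\n") substr(line, 1, j - 1)
                last = buf; found = 1; inside = 0  # a block just closed: remember it
                line = substr(line, j + 6)         # keep scanning after </tag>
            }
        }
        END { if (found) print last }
    ' "$1"
}
```

Called as `last_block trace.log`, it prints only the final block. A block that is still open at the end of the file is ignored, so the last complete pair is returned.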

pfnuesel · Answer 3 · 2013-10-30T17:23:10.103

An alternative to grep would be sed:

tac file | sed -n '0,/<tag>\(.*\)<\/tag>/s//\1/p'

tac file prints the file in reverse order (cat backwards). sed then works from input line 0 to the first occurrence of <tag>.*</tag> and replaces that match with only the part that was inside the tags (the 0,/re/ address form is a GNU sed extension). The p flag prints the result, which -n would otherwise suppress.

Edit: This does not work if <tag> and </tag> are on different lines. We can still use sed for that:

tac file | sed -n '/<\/tag>/,$p; /<tag>/q' | sed 's/.*<tag>//; s/<\/tag>.*//' | tac

Again we use tac to read the file backwards. The first sed command prints from the first occurrence of </tag> onward and quits when it finds <tag>, so only the lines in between are printed. A second sed process strips the tags themselves, and a final tac restores the original line order.

Vorsprung · Answer 4 · score 0 · answered Oct 30 '13 at 12:11 · edited Jun 15 '15 at 14:56 by fedorqui

perl -e '$/=undef; $f=<>; push @a,$1 while($f=~m#<tag>(.*?)</tag>#msg); print $a[-1]' ex.txt

Extra Credit: Any way I can return the content contained within the two strings only if the content contains "testString"?

perl -e '$/=undef; $f=<>; push @a,$1 while($f=~m#<tag>(.*?)</tag>#msg); print $a[-1] if ($a[-1]=~/testString/);' ex.txt

mpez0 · Answer 5 · 2013-10-30T12:42:12.087

A little untested awk that handles multiple lines:

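awk '
    BEGIN     {keep = "false"; retain = ""}
    # closing tag: append the line and stop keeping (assumes the tags are on different lines)
    /<\/tag>/ {retain = retain "\n" $0; keep = "false"; next}
    # opening tag: discard any earlier block and start again
    /<tag>/   {keep = "true"; retain = $0; next}
    keep == "true" {retain = retain "\n" $0}
    END {print retain}
' filename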

We start by just reading the file; when we hit the <tag>, we start keeping lines. When we hit the </tag>, we stop. If we hit another <tag>, we clear the retained string and start again. If you want all the strings, print at each </tag>.
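
The "print at each </tag>" variant could look roughly like the following. It is a sketch under the same assumption that <tag> and </tag> sit on different lines; all_blocks and the "--" separator line are hypothetical choices made for illustration.

```shell
# all_blocks FILE (hypothetical helper)
# Prints every <tag>...</tag> block in FILE, tag lines included,
# followed by a "--" separator line after each block.
all_blocks() {
    awk '
        /<tag>/           { keep = 1; block = $0; next }  # start a new block
        keep              { block = block "\n" $0 }       # collect lines, closing line too
        keep && /<\/tag>/ { print block; print "--"; keep = 0 }
    ' "$1"
}
```

Piping the result through tail, or keeping only the last block as the answer does, gets back to the original question.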
