Continuing an awk or sed print including a keyword until an end pattern is reached

Question

I have a large number of long, irregular logs that look like this:

###<date> errortext <errorcode-xxxxx> 
errortext 
errortext 
errortext 
errortext
###<date> errortext <errorcode-yyyy>
errortext 
errortext 
###<date> errortext <errorcode-zzzzzzz>
errortext 
errortext 
errortext 
errortext 
errortext 
errortext 
errortext

etc

The sections vary in length, and I need to find all errors that share an error code using grep, awk, sed, or a similar tool.

I need to split these documents by error code, writing all errors with the same code into one document.

I tried to print a whole error-code segment with a command like this:

sed -n '/#</{:start /###/!{N;b start};/<errorcode-024332>/p}' file

The problem is that this prints only the line containing "errorcode-024332", not the whole segment up to the start of the next one (marked by the "###" delimiter in this case).

How do I achieve this?

https://stackoverflow.com/questions/38972736/how-to-select-lines-between-two-patterns might help, for ex: `awk '/errorcode-024332/{f=1; print; next} /^###/{f=0} f' file` will get you `errorcode-024332` section — Sundeep, Feb 21 '17 at 14:48

Aaron · Accepted Answer · 2017-02-22T10:03:18.060

Your problem happens because both #< and ### match the "header" line, so the /###/! test fails immediately: you only print the header and never loop. You also appended to the pattern space (with N) rather than consuming the lines one by one, so the header's ### would always have been matched anyway.

Assuming you want to display the "header" and "errortext" lines of the "errorcode-024332" section, here's how I would do it:

sed -n '/#<.*<errorcode-024332>/{:start p;n;/###/!{b start}}' file

1. When we match the header line corresponding to our error code,
2. we print it,
3. we get the next line,
4. and if the next line doesn't contain ###, we go back to step 2.
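
The same steps can also be written as a multi-line script with one command per line, which is easier to annotate. This is only a sketch: the sample_log function and its contents are made up for illustration, and errorcode-yyyy stands in for whichever code you are after.

```shell
# Hypothetical sample data, modelled on the log format in the question.
sample_log() {
cat <<'EOF'
###<date> errortext <errorcode-xxxxx>
errortext a
###<date> errortext <errorcode-yyyy>
errortext b
errortext c
###<date> errortext <errorcode-zzzzzzz>
errortext d
EOF
}

# Same logic as the one-liner, one sed command per line.
result=$(sample_log | sed -n '
# Step 1: header line of the section we want
/#<.*<errorcode-yyyy>/ {
    :start
    # Step 2: print the current line
    p
    # Step 3: replace it with the next input line
    n
    # Step 4: loop unless that line starts a new section
    /###/!b start
}
')

printf '%s\n' "$result"
```

With this sample, only the errorcode-yyyy header and its two errortext lines are printed.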

A quick test I did with your sample data :

$ echo "###<date> errortext <errorcode-xxxxx>
errortext
errortext
[...]
errortext
errortext " | sed -n '/#<.*<errorcode-yyyy>/{:start p;n;/###/!{b start}}'

###<date> errortext <errorcode-yyyy>
errortext
errortext
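
One caveat with the loop above: the ### line that ends the loop has already been read by n and is never tested against the error-code address, so if two sections with the same code follow each other directly, the second one is skipped. A possible alternative, in the spirit of the awk flag approach suggested in the comments on the question, re-evaluates a flag on every header line. Treat it as a sketch under assumptions: the sample data is invented, the variable name keep is arbitrary, and errorcode-yyyy is a placeholder.

```shell
# Hypothetical sample with two adjacent sections sharing the same code.
sample_log() {
cat <<'EOF'
###<date> errortext <errorcode-yyyy>
errortext a
###<date> errortext <errorcode-yyyy>
errortext b
###<date> errortext <errorcode-xxxxx>
errortext c
EOF
}

result=$(sample_log | awk '
# On every header line, decide whether the section that follows is wanted.
/^###/ { keep = ($0 ~ /<errorcode-yyyy>/) }
# Print the current line (header included) while the flag is set.
keep
')

printf '%s\n' "$result"
```

Here both errorcode-yyyy sections are printed and the errorcode-xxxxx section is dropped. Redirecting the output to a file collects all sections for one code in a single document.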

edited Feb 22 '17 at 10:03

answered Feb 21 '17 at 14:32

Adding my keyword as such: sed -n '/#{:start N;/###/!{b start};//p}' file gave me the same result as my old command. Did I misinterpret where to put this? – Flowdorio Feb 21 '17 at 14:36

@Flowdorio I've edited it, please tell me if it answers your question. – Aaron Feb 21 '17 at 14:44
It does! Thank you! – Flowdorio Feb 21 '17 at 14:53

hek2mgl · Answer 2 · 2017-02-21T14:52:54.257

You can use awk, like this:

awk -F'[<>-]' '/^#/{f=$(NF-1)}{print >> f; close(f)}' file.log

Let me explain it as a multiline version:

# With this set of field delimiters, the error code is simply
# the second-to-last field
BEGIN { FS="[<>-]"}

# On lines which start with a '#'
/^#/ {
    # We set the output (f)ilename to the error code
    f=$(NF-1)
}

# On all lines ...
{
    # ... append current line to (f)ilename
    print >> f;

    # Make sure to close the file to avoid running out of
    # file descriptors in case there are many different error
    # codes. If you are not concerned about that, you may
    # comment out this line.
    close(f)
}
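
To see what the script produces, here is a small run on made-up data. It is a sketch, not a drop-in tool: the temporary directory, the sample contents, and the extra f != "" guard (which skips any lines that appear before the first header) are my additions for illustration.

```shell
# Work in a throwaway directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample log, modelled on the format in the question.
cat > file.log <<'EOF'
###<date> errortext <errorcode-xxxxx>
errortext 1
###<date> errortext <errorcode-yyyy>
errortext 2
###<date> errortext <errorcode-xxxxx>
errortext 3
EOF

# Same logic as above, plus a guard that ignores lines seen before
# the first header (f is still empty at that point).
awk -F'[<>-]' '
/^#/    { f = $(NF-1) }
f != "" { print >> f; close(f) }
' file.log

# Both xxxxx sections end up in the file named xxxxx.
cat xxxxx
```

This leaves two output files next to file.log, named xxxxx and yyyy after the error codes. Because >> appends, remove old output files before running the script again, otherwise the sections are duplicated.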
