
I'm trying to automate some time-consuming log-checking tasks by building a system that I can later reuse for other purposes.

I have a logfile, for example:

    ...multiline ACTION Text where all is good...
    ERR-101 Something is wrong
    ERR-201 Something is wrong with QASDASDASD
    INFO-524 Something was wrong
    WARN-484 Check line 23
    ...multiline ACTION Text where all is good...
    ERR-101 Something is wrong
    ERR-201 Something is wrong with PPOYOYOY
    INFO-524 Something was wrong
    WARN-484 Check line 23
    INFO-524 This is it

I'm creating a check-error.template file:

    # This is the template file
    ERR-101 Something is wrong
    ERR-201 Something is wrong with <TEXT_VAR>
    INFO-524 Something was wrong
    WARN-484 Check line <NUMBER_VAR>
    <?>INFO-524 This is it</?>

Lines starting with # are comments. Text surrounded by <?> and </?> is optional (e.g. it exists only in the last paragraph, so the template matches paragraphs with and without it).
The text and number placeholders will be checked by regexp.

If the error matches the template, I know it's ignorable, and I want to log it to the side and remove it from the log.
I'm not using anything more advanced (perl, other regexp helpers), because it would be hard to make sure it exists in every environment. I'm currently trying to do it with grep -P.

The following function reads a template file and converts it to a regexp pattern:

    function template2variable {
        local file=$1
        local var_name=$2
        local template pattern
        template=$(sed '/^#/d' "$file")                   # drop comment lines
        pattern="${template//\\/\\\\}"                    # replace \ with \\
        pattern="${pattern//\"/\\\"}"                     # escape "
        pattern="${pattern//<TEXT_VAR>/([[:alnum:]_]+)}"  # placeholder names as used in the template
        pattern="${pattern//<NUMBER_VAR>/([[:digit:]]+)}"
        pattern="${pattern//$'\n'/\\n}"                   # newline becomes a literal \n
        pattern="${pattern//<\?>/(}"                      # \? keeps ? from acting as a glob wildcard
        pattern="${pattern//<\/\?>/)?}"
        printf -v "$var_name" '%s' "$pattern"             # store result in the named variable
    }

    # Pass the variable's name, not its (still empty) value
    template2variable "check-error.template" error_template

Matching template with:

    grep -Pzo "${error_template}" "$logfile"

Doing so, I get back all the template lines I wanted.

However, when I try to work with the grep output:
-n prefixes every match with line number 1
-c gives a count of 1
-v produces empty output

It seems like the match has returned as one giant result instead of several I can iterate over.
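
For reference, this is consistent with how GNU grep documents -z: input and output "lines" are terminated by a NUL byte instead of a newline, so a log without NUL bytes is a single record (hence -n and -c report 1), while -o prints each match followed by a NUL. A minimal sketch of splitting on that NUL is below. The sample data and the hand-written pattern are assumptions for illustration, not the output of template2variable.

```shell
#!/usr/bin/env bash
# Hypothetical sample log, written to a temp file for the demo.
logfile=$(mktemp)
printf '%s\n' 'Line 1' 'Dave is right' 'Sharon is great' \
              'Line 2' 'Dave is right' 'Aharon is nice' > "$logfile"

# Hand-written pattern (assumption): a numbered line followed by a fixed line.
pattern='Line ([[:digit:]]+)\nDave is right\n'

# With -z the whole file is one NUL-terminated record,
# so -c counts matching records, not individual matches.
grep -Pzc "$pattern" "$logfile"    # prints 1

# -o emits each match followed by a NUL; read them one at a time.
matches=()
while IFS= read -r -d '' m; do
    matches+=("$m")
done < <(grep -Pzo "$pattern" "$logfile")

# Each array element is one multi-line match, including its trailing newline.
for i in "${!matches[@]}"; do
    printf 'match[%d]\n%s' "$i" "${matches[$i]}"
done

rm -f "$logfile"
```

The loop uses process substitution rather than a pipe so that the array survives after the loop ends.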

What am I doing wrong?
Suggestions for improvement?

**Summary:
I want to define a "template": a paragraph of text to be matched inside another text file (the logfile). A match occurs only if the whole paragraph is found. Some lines of the template contain a placeholder for text or a number (e.g. "Line <NUMBER_VAR>" matches "Line 1", "Line 2", etc.). To do this with bash/grep, I convert the template to a regexp.

How can I iterate over the results and create a new logfile without them?**

Thank you

Adding a minified example:

Logfile:

    Line 1
    Dave is right
    Sharon is great
    Line 2
    Dave is right
    Aharon is nice

If the template file is:

    # Template file
    Line <NUMBER_VAR>
    Dave is right

The script will read the template and search for it inside the logfile, so that I can iterate over the matches and get:

    match[0]
        Line 1
        Dave is right
    match[1]
        Line 2
        Dave is right
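
One possible way to get from those matches to a side log and a cleaned logfile, using only bash and grep -P, is sketched below. The file names and the hand-written pattern are hypothetical, and the sketch assumes the log is small enough to hold in a shell variable and contains no NUL bytes.

```shell
#!/usr/bin/env bash
# Hypothetical file names, created in a temp directory for the demo.
workdir=$(mktemp -d)
logfile=$workdir/app.log
ignored=$workdir/ignored.log
cleaned=$workdir/app.cleaned.log

printf '%s\n' 'Line 1' 'Dave is right' 'Sharon is great' \
              'Line 2' 'Dave is right' 'Aharon is nice' > "$logfile"

# Hand-written pattern (assumption), equivalent to the small template above.
pattern='Line ([[:digit:]]+)\nDave is right\n'

# Read the whole log; the sentinel "x" protects trailing newlines,
# which command substitution would otherwise strip.
log=$(cat "$logfile"; printf x)
log=${log%x}

: > "$ignored"
while IFS= read -r -d '' m; do
    printf '%s' "$m" >> "$ignored"   # log the ignorable paragraph to the side
    log=${log/"$m"/}                 # quoted, so removed as a literal string (first occurrence)
done < <(grep -Pzo "$pattern" "$logfile")

printf '%s' "$log" > "$cleaned"
cat "$cleaned"
```

Because grep reports matches in file order, removing the first literal occurrence on each iteration should keep the removals aligned with the matches.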
WhenYouDev
  • Dumb down your question and provide input and output examples... I read this twice and still don't know why or what is going on here – Mike Q Mar 10 '23 at 00:01
  • @MikeQ I've rephrased the question, added a summary, and added a shorter example. – WhenYouDev Mar 10 '23 at 00:36
  • Having sample input and expected output from that same input will definitely help clarify to us readers what you are trying to achieve. Also, you'll do well to read, review and take to heart the items on this page : https://stackoverflow.com/tags/bash/info . Skip the Version information at the top and search for the sections labeled "Before asking about Problematic code" and "How to turn a bad script into a good question" . Good luck. Do you know about `sed` and `awk`. They are designed from processing text file data and have many features. Good luck. – shellter Mar 10 '23 at 00:42
  • pluse-uno for improving your question. (still a bit long, but better sample data is key!) Can "we" use anything to solve this problem or are you committed to this approach (for whatever reason). Good luck. – shellter Mar 10 '23 at 00:45
  • @shellter I'm familiar with awk and sed, but didn't find an option to use them with the regexp. The function provided shows my progress so far in converting a text to a regexp, but it is not a requirement. I went over the bash info, thank you. An approach that only defines the start/end of a paragraph ignores what is inside it (e.g. if I have different line numbers for the same text/error I want to match). I want something that won't require installing new code. grep -P, awk and sed exist on all of them. – WhenYouDev Mar 10 '23 at 01:09
  • What is producing the `match [0[` etc output? Looks like debugging info, Do you need that in the final output? (Or haven't I read your question carefully enough (-;?) . – shellter Mar 10 '23 at 01:16
  • "_it will be an issue to make sure it exists on every environment_": I would say that `perl` is more likely to be available than `grep -P` – Fravadona Mar 10 '23 at 08:58
  • can `` include newlines? – Fravadona Mar 10 '23 at 09:27

1 Answer

Simple fix to your script. You need to insert missing semi-colons, by replacing all instances of

    pattern="${pattern

by

    pattern=";${pattern

That will set the OR function for the match/action. Also, I don't completely understand what you are doing, but the line

    local pattern="${template//\\/\\\\}"

seems like there is something wrong with that. It seems like it is either incomplete or malformed. I could be wrong.

Eric Marceau
  • local pattern="${template//\\/\\\\}" Makes sure that if I have any \ in the template file, they will be converted to \\ in the regexp match. I want to match the entire paragraph (which happens), I don't need OR between them. The issue is iterating. – WhenYouDev Mar 10 '23 at 01:43
  • @WhenYouDev, there is no iterating in your above script. Only one instance of printf for outputing the ${pattern} value. – Eric Marceau Mar 10 '23 at 02:00
  • I'm asking how to iterate over grep -Pzo "${error_template}" $logfile – WhenYouDev Mar 10 '23 at 02:25
  • **-z** refers to handling of the input stream, not a pattern file. For use of a multiline pattern file, you need to specify **-f ${filename}**. – Eric Marceau Mar 10 '23 at 02:34