I'm quite new to ELK and Grok-filtering, and I'm struggling with parsing this particular pattern in my grok filter.

I've used the grok debugger to try and solve this, but although I like the tool, I just get confused by the custom patterns.

Eventually, I hope to parse lots of log files sent by filebeat to logstash, then send the parsed logs to elasticsearch and display with kibana or some similar visualization tool.

The lines I need to parse follow this pattern:

1310 2017-01-01 16:48:54 [325:51] [326:49] [359:57] Some log info text
  • The first four digits are a log type identifier and will be used for grouping. I've called the field "LogLineID".
  • The date is formatted YYYY-MM-DD HH:MM:SS, and is parsed ok. I called the field "LogDate".
  • But now the problem begins. Within the square brackets, I have counters, formatted as MM:SS if you like. I cannot for the life of me find a way to sort these out, but I need to compare these times, hence I want to store them as minutes and seconds, not just numbers.
    • The first is a counter "TimeSpent",
    • the second is a counter "TimeStarted" and
    • the third is a counter "TimeSinceDown".
  • Then, last, comes the info text, which I've managed to grok by simply applying %{GREEDYDATA:LogInfo}.

I notice that the number of minutes can be far higher than the 60 minutes in an hour, so I may be barking up the wrong tree trying to parse it with date patterns such as TIMESTAMP_ISO8601, but I don't really know how else to do this.

So, I came this far:

%{NUMBER:LogLineID} %{TIMESTAMP_ISO8601:LogDate}

and was, as mentioned, able (by cutting away the square bracket parts) to parse the log info text with

%{GREEDYDATA:LogInfo}

to create a field LogInfo.

But that's where I'm stuck. Could someone please help me figure out the rest?

Massive thanks in advance.

PS! I also found %{NUMBER:duration}, but as far as I could tell it can only parse values with a dot as the separator, not a colon.

Vandalf

2 Answers

A grok regex expression can help you solve the problem.

But first I want to make sure: do you mean that [325:51] [326:49] [359:57] are the three components you want to extract, so that the result looks like this:

TimeSpent: 325:51
TimeStarted: 326:49
TimeSinceDown: 359:57

If I've got that right, you can use either of the following approaches:

  1. Define your own custom pattern file and add the pattern to it.
  2. Use the expression directly in the filter section of the Logstash conf file.
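
The two suggestions above could be sketched roughly as follows. This is only an illustration, not a tested configuration: the pattern name MMSS is made up for this example, and the inline pattern_definitions option assumes a version of the grok filter plugin that supports it. For the pattern-file variant, the same definition (a line reading `MMSS %{INT}:%{INT}`) would go into a file in a directory referenced by the grok filter's patterns_dir option.

```
filter {
  grok {
    # MMSS is a hypothetical custom pattern: digits, a colon, digits.
    # It keeps the counter as one string such as "325:51".
    pattern_definitions => {
      "MMSS" => "%{INT}:%{INT}"
    }
    match => {
      "message" => "%{NUMBER:LogLineID} %{TIMESTAMP_ISO8601:LogDate} \[%{MMSS:TimeSpent}\] \[%{MMSS:TimeStarted}\] \[%{MMSS:TimeSinceDown}\] %{GREEDYDATA:LogInfo}"
    }
  }
}
```

For the sample line in the question, this should yield the TimeSpent, TimeStarted and TimeSinceDown fields in the form listed above, still as text rather than as numbers.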

Hope this helps.

Lin Don

Ah, there was a space. Actually, I was misleading myself and everybody else in my question, as it was not that log line that was causing problems. I just took the first one, not realizing where the problem really was; the line causing problems had a space within the brackets, like this: [ 42:31]. In some places there are two spaces, so I solved this by including a %{SPACE} between the \[ and the %{NUMBER}:

%{NUMBER:LogLineID} %{TIMESTAMP_ISO8601:LogDate} \[%{SPACE}%{NUMBER:TimeSpentMinutes}\:%{NUMBER:TimeSpentSeconds}\] \[%{SPACE}%{NUMBER:TimeStartedMinutes}\:%{NUMBER:TimeStartedSeconds}\] \[%{SPACE}%{NUMBER:TimeSinceDownMinutes}\:%{NUMBER:TimeSinceDownSeconds}\] %{GREEDYDATA:LogText}

I still haven't solved the merging of minutes and seconds, but I can handle that at a later stage.
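
One possible way to merge the minutes and seconds, given here as a sketch rather than a tested solution: a ruby filter placed after the grok filter could combine each pair into a single number of seconds, which is easy to compare. The *TotalSeconds field names are invented for this example, and the event.get / event.set calls assume Logstash 5.0 or later.

```
filter {
  ruby {
    # For each counter, read the Minutes/Seconds fields produced by the
    # grok pattern above and store minutes * 60 + seconds in a new field.
    # The "TotalSeconds" suffix is a hypothetical naming choice.
    code => "
      ['TimeSpent', 'TimeStarted', 'TimeSinceDown'].each do |name|
        minutes = event.get(name + 'Minutes')
        seconds = event.get(name + 'Seconds')
        # Skip counters that grok did not extract for this event.
        next if minutes.nil? || seconds.nil?
        event.set(name + 'TotalSeconds', minutes.to_i * 60 + seconds.to_i)
      end
    "
  }
}
```

With this approach, a counter such as [325:51] would be stored as 325 * 60 + 51 = 19551 seconds. Because the value is a plain integer, minute counts above 59 are not a problem.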

Thanks to Lin Don for showing an interest in my problem, and sorry for not replying sooner.

Hope the solution will help others (or even myself) if they're stuck on the same kind of problem.

Note to myself: Read the logs more carefully before grok'ing.. :)

Vandalf