I want to parse a timestamp from my logs so that Loki uses it as the timestamp of each entry.
I'm a total noob when it comes to regex.

The log file is from "endlessh", which is essentially a tarpit/honeypot for SSH attackers.

It looks like this:

2022-04-03 14:37:25.101991388  2022-04-03T12:37:25.101Z CLOSE host=::ffff:218.92.0.192 port=21590 fd=4 time=20.015 bytes=26
2022-04-03 14:38:07.723962122  2022-04-03T12:38:07.723Z ACCEPT host=::ffff:218.92.0.192 port=64475 fd=4 n=1/4096

What I want to match, using regex, is the second timestamp on each line, since it's a UTC timestamp and should be parseable by Promtail.

I've tried different approaches, but just couldn't get it right at all.

So first of all, I need a regex that matches the timestamp I want.
Secondly, I somehow need to write that regex so that it exposes the matched value. The docs offer this example:

.*level=(?P<level>[a-zA-Z]+).*ts=(?P<timestamp>[T\d-:.Z]*).*component=(?P<component>[a-zA-Z]+)

AFAIK those are named groups. Is that all it takes to expose the value so I can use it in the config?

Would be nice if someone can provide a solution for the regex, and an explanation of what it does :)

Luc
  • Perhaps capture the second timestamp in a named capture group `^\d{4}-\d{2}-\d{2} \d\d:\d\d:\d\d\.\d+\s+(?P<timestamp>\d{4}-\d{2}-\d{2}T\d\d:\d\d:\d\d\.\d+Z)\b` https://regex101.com/r/Hoc0TW/1 – The fourth bird Apr 03 '22 at 14:48
  • But wouldn't it be easier to just look for the n-th sequence of characters? – Luc Apr 03 '22 at 15:17
  • Do you mean like this? `^(?:\S+\s+){2}(?<timestamp>\S+)` https://regex101.com/r/pPzR8m/1 – The fourth bird Apr 03 '22 at 15:19

1 Answer

You could, for example, write a specific pattern that matches the first timestamp and captures the second one in a named group:

^\d{4}-\d{2}-\d{2} \d\d:\d\d:\d\d\.\d+\s+(?P<timestamp>\d{4}-\d{2}-\d{2}T\d\d:\d\d:\d\d\.\d+Z)\b

Regex demo
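
To see what the specific pattern does, here is a minimal sketch in Python that runs it against the two sample lines from the question. The part before the group matches the local timestamp (date, space, time, fractional seconds) plus the whitespace after it; the named group `timestamp` then captures the ISO-style UTC value ending in `Z`. The helper name `extract_timestamp` is made up for this illustration, and Python's `re` stands in for Promtail's regex engine, so treat it as a quick check rather than a Promtail test.

```python
import re

# Pattern from the answer: match the local timestamp first,
# then capture the UTC timestamp in the named group "timestamp".
PATTERN = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d\d:\d\d:\d\d\.\d+\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d\d:\d\d:\d\d\.\d+Z)\b"
)

# The two sample lines from the question.
SAMPLE_LINES = [
    "2022-04-03 14:37:25.101991388  2022-04-03T12:37:25.101Z CLOSE "
    "host=::ffff:218.92.0.192 port=21590 fd=4 time=20.015 bytes=26",
    "2022-04-03 14:38:07.723962122  2022-04-03T12:38:07.723Z ACCEPT "
    "host=::ffff:218.92.0.192 port=64475 fd=4 n=1/4096",
]


def extract_timestamp(line):
    """Return the captured UTC timestamp, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.group("timestamp") if match else None


for line in SAMPLE_LINES:
    print(extract_timestamp(line))
```

Because the pattern spells out both timestamp formats, a line that does not start with the expected local timestamp simply fails to match instead of yielding a wrong value.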

Or, if the format is always the same, use a very broad pattern that skips an exact number of non-whitespace parts and captures the part that you want to keep.

^(?:\S+\s+){2}(?<timestamp>\S+)

Regex demo
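
A rough Python sketch of the broad pattern follows, mainly to show its trade-off: it captures whatever the third whitespace-separated field is, timestamp or not. The group is written as `(?P<timestamp>...)` because Python's `re` only accepts that spelling; as far as I know, Go's regexp package (which Promtail builds on) also accepted only the `?P` form before Go 1.22, so that spelling may be the safer one there too. The names `parse_utc` and `odd`, and the `odd` line itself, are made up for the illustration.

```python
import re
from datetime import datetime, timezone

# Broad pattern: skip two whitespace-separated fields, capture the third.
# Python's re module needs the (?P<name>...) spelling for named groups.
BROAD = re.compile(r"^(?:\S+\s+){2}(?P<timestamp>\S+)")


def parse_utc(line):
    """Return the third field as an aware UTC datetime, or None if unusable."""
    match = BROAD.match(line)
    if match is None:
        return None
    try:
        # %f accepts one to six fractional digits, so ".723" is fine.
        parsed = datetime.strptime(
            match.group("timestamp"), "%Y-%m-%dT%H:%M:%S.%fZ"
        )
    except ValueError:
        return None  # the third field was not a timestamp
    return parsed.replace(tzinfo=timezone.utc)


good = (
    "2022-04-03 14:38:07.723962122  2022-04-03T12:38:07.723Z ACCEPT "
    "host=::ffff:218.92.0.192 port=64475 fd=4 n=1/4096"
)
# Hypothetical line that lacks the two leading date/time fields.
odd = "ACCEPT host=::ffff:218.92.0.192 port=64475"

print(parse_utc(good))                      # parsed UTC datetime
print(BROAD.match(odd).group("timestamp"))  # the regex still captures something
print(parse_utc(odd))                       # but it is not a timestamp
```

So the broad pattern is fine as long as every line really has the same layout; if other line shapes can show up in the file, the more specific pattern fails more safely.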

The fourth bird