My data looks like this:

[ REPORT_PROFILE = Some text ] [ TIME_GENERATED = 1579734865 ] [ RECORD_NUMBER = 131757058 ]

My data might also contain [ SOME_KEY = Some value ].

I'd like to extract:

| Key            | Value      |
|----------------|------------|
| SOME_KEY       | Some value |
| REPORT_PROFILE | Some text  |
| TIME_GENERATED | 1579734865 |
| RECORD_NUMBER  | 131757058  |

I could do this with multiple regexes, e.g.:

\[\s+REPORT_PROFILE = (?<REPORT_PROFILE>[^\]]+)\s+\]

\[\s+TIME_GENERATED = (?<TIME_GENERATED>[^\]]+)\s+\]

But is there a way I can use a single regex to extract an arbitrary number of match groups, dynamically naming them based on a key name in the source text?

I'm using Splunk but it's just PCRE under the hood (not PCRE2, to clarify).

gf131072

1 Answer

((?:\[ [^\[\]]+ = [^\[\]]+ \])+)

Regex101 Breakdown
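Outside of Splunk, the same idea can be sketched in Python: rather than asking the regex engine to name groups dynamically, capture the key and the value as two ordinary groups and build the mapping from every match. This is only an illustration under some assumptions: keys consist of word characters, values contain no square brackets, and `PAIR` and `extract_pairs` are hypothetical names chosen for the example.

```python
import re

# Generic pair pattern: group 1 is the key, group 2 is the value.
# Assumption: keys are word characters and values contain no square brackets.
# The lazy value group plus \s* keeps surrounding spaces out of the value.
PAIR = re.compile(r"\[\s*(\w+)\s*=\s*([^\[\]]*?)\s*\]")


def extract_pairs(text):
    """Return a dict mapping each key found in `text` to its value."""
    return {key: value for key, value in PAIR.findall(text)}


sample = (
    "[ REPORT_PROFILE = Some text ] [ TIME_GENERATED = 1579734865 ] "
    "[ RECORD_NUMBER = 131757058 ] [ SOME_KEY = Some value]"
)
fields = extract_pairs(sample)
print(fields)
```

The naming happens in the surrounding code, not in the pattern, so any number of unknown keys can be handled by one regex.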

EDIT

This version returns named groups for the three known keys. The arbitrary key can't be named, because this solution depends on a positive lookbehind, which must have a fixed width.

((?:\[ [^\[\]]+ = ((?<=REPORT_PROFILE = )(?<REPORT_PROFILE>[^\[\]]+)|(?<=TIME_GENERATED = )(?<TIME_GENERATED>[^\[\]]+)|(?<=RECORD_NUMBER = )(?<RECORD_NUMBER>[^\[\]]+)|([^\[\]]+)) \])+)

Regex101/2 Breakdown
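For the known-key case, a rough Python sketch of a related approach is shown here. It is not the lookbehind pattern above: it puts each key name directly in front of its own named group, so no lookbehind is needed. Python's `re` module spells named groups `(?P<name>...)`. The names `KNOWN_KEYS`, `KNOWN`, and `extract_known` are hypothetical, and the pattern assumes exactly one space around the brackets and the `=` sign.

```python
import re

KNOWN_KEYS = ["REPORT_PROFILE", "TIME_GENERATED", "RECORD_NUMBER"]

# One alternative per known key; each names its value group after the key.
alternatives = "|".join(
    rf"{key} = (?P<{key}>[^\[\]]+?)" for key in KNOWN_KEYS
)
KNOWN = re.compile(rf"\[ (?:{alternatives}) \]")


def extract_known(text):
    """Collect the values of the known keys that appear in `text`."""
    found = {}
    for match in KNOWN.finditer(text):
        # Only the alternative that matched has a non-None group.
        found.update(
            {name: value for name, value in match.groupdict().items() if value is not None}
        )
    return found


line = (
    "[ REPORT_PROFILE = Some text ] [ TIME_GENERATED = 1579734865 ] "
    "[ RECORD_NUMBER = 131757058 ] [ SOME_KEY = Some value ]"
)
known_fields = extract_known(line)
print(known_fields)
```

Pairs with unknown keys, such as SOME_KEY, are skipped, since every group name has to be written into the pattern ahead of time.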

varontron
  • That captures an arbitrary number of [ somekey = somevalue ] pairs as match groups, but captures the entire string as the match and doesn't name it. I'm hoping to capture the somevalue bit as the match group text, and name the match group somekey. – gf131072 Jan 23 '20 at 01:47