
I am trying to parse the following web server log entry into separate fields:

/BluePortServlets/LoadService/servicepath/test1/test2/test3?serviceId=4403&categoryId=1&t=0.13146932582447225

My pattern is the following:

/%{WORD:PATH}/%{WORD:PATH}/%{WORD:PATH} .....

My problem is that the number of levels in the path is not fixed, so I want to apply something like a Kleene star to the pattern. That way I would be able to parse paths of unknown depth. However, the Grok Debugger does not compile it.

Something like this

[%{WORD:PATH}/]*

The desired result would be

BluePortServlets
LoadService
servicepath
test1
..
testN

Thank you in advance

stratis
  • Like `%{WORD:HTTP} /%{WORD:PATH}(?:/%{WORD:PATH})*`? – Wiktor Stribiżew Sep 29 '15 at 12:36
  • Yes mate, I am trying out what you sent me, but it gives me null after BluePortServlets. But it's definitely something like that – stratis Sep 29 '15 at 12:46
  • I get `{ "HTTP": [ [ "GET" ] ], "PATH": [ [ "BluePortServlets", "servicepath" ] ] }` with my above suggestion. What results do you expect? – Wiktor Stribiżew Sep 29 '15 at 12:56
  • You will not be able to keep all the captures that way, you will only have the last captured text. It is how Oniguruma/PCRE and most other regexps work. You should know how many subparts there are in the URL beforehand, and/or use optional groups. Try `%{WORD:HTTP} /(?[^/?]*)(?:/(?[^/?]*))?(?:/(?[^/?]*))?(?:/(?[^/?]*))?(?:/(?[^/?]*))?`. This does not support query string though, no idea if you need it. – Wiktor Stribiżew Sep 29 '15 at 13:02
  • I need to have more than the last captured text, that is my problem. I wonder why the Kleene star can't do the job here, because the whole point of regular expressions is that you define a pattern regardless of the size of the input. – stratis Sep 29 '15 at 14:02
  • It's not clear what the desired end result is. Given the URL path above, what part do you want to extract into a field? – Magnus Bäck Sep 29 '15 at 14:03
  • But capturing just does not work the way you expect. The answer, then, is "No, you cannot use capturing to get an unspecified number of URL subparts". – Wiktor Stribiżew Sep 29 '15 at 14:07
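
The behaviour described in the comments can be reproduced outside Grok. The following is only a minimal sketch in Python, whose `re` module is not Oniguruma but treats a repeated capture group the same way: only the last iteration is kept. The variable names (`url`, `last_segment`, `segments`) are made up for the illustration, and splitting the path is a swapped-in alternative, not something proposed in the thread.

```python
import re
from urllib.parse import urlsplit

# The request line from the question.
url = ("/BluePortServlets/LoadService/servicepath/test1/test2/test3"
       "?serviceId=4403&categoryId=1&t=0.13146932582447225")

# A capture group placed under a Kleene star matches every segment,
# but the group itself only remembers its final iteration.
match = re.match(r"(?:/(?P<PATH>\w+))*", url)
last_segment = match.group("PATH")

# Assumed alternative: isolate the path (dropping the query string)
# and split it on "/" to obtain every level, whatever the depth.
path = urlsplit(url).path
segments = [part for part in path.split("/") if part]

print(last_segment)
print(segments)
```

Under these assumptions, `last_segment` holds only the final path level, while `segments` holds all of them in order. In a Logstash pipeline the analogous idea would probably be to capture the whole path into a single field and split that field in a later step, but that is not confirmed anywhere in this thread.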
