I am parsing proxy logs with Logstash and its Grok filter. The logs contain quoted strings:

1438120705 [.....] "SEF-EDP8" - "C" "/GPM/1023/5745-7/456V/"

In the Grok Debugger, the following pattern works like a charm:

%{NUMBER:ts} [......] (-|"%{USERNAME:token1}") (-|%{DATA:token2}) (-|"%{WORD:token3}") (-|"%{DATA:token4}")

This does not work in Logstash's grok filter: the pattern is itself wrapped in double quotes in the config file, so the first double quote inside the pattern ends the string. Logstash error log:

Error: Expected one of #, {, } at line 9, column 204 (byte 374) after
filter {
    grok {
        match => { "message" => "%{NUMBER:ts} [......] ("

So I used the QS (QuotedString) grok pattern instead:

%{NUMBER:ts} [......] (-|%{QS:token1}) (-|%{DATA:token2}) (-|%{QS:token3}) (-|%{QS:token4})

This works in the Grok Debugger as well, but the surrounding quotes are captured as part of each quoted string. It doesn't work with Logstash either:

token1 : ""SEF-EDP8""
token2 : null
token3 : ""C""
token4 : ""/GPM/1023/5745-7/456V/""

How can I make it work with Logstash? How can I remove these unwanted extra double quotes?

baudsp
c-val
  • What happens if you just escape the quotes with backslashes? – fafl Feb 24 '16 at 10:47
  • Try `%{NUMBER:ts} \[[^\]]*] (-|"(%{DATA:token1})") (-|"(%{DATA:token2})") (-|"(%{DATA:token3})")( (-|"(%{DATA:token4})"))`. I have no more example input, so I am not sure it will work with all of them. – Wiktor Stribiżew Feb 24 '16 at 10:48
  • @fafl it doesn't like it either – c-val Feb 24 '16 at 10:56
  • It does not need escaping, `QS` matches the quoted strings by design. You need `DATA` tokens. – Wiktor Stribiżew Feb 24 '16 at 10:59
  • @Wiktor Stribiżew: The parentheses do not change Logstash's behavior: Error: Expected one of #, {, } at line 9, column 204 (byte 374) after filter { grok { match => { "message" => "%{NUMBER:ts} [......] (" – c-val Feb 24 '16 at 11:18

3 Answers

Changing the outer double quotes to single quotes instead did the trick for me:

grok {
  match => { "message" => 'SOME "TEXT QUOTED"' }
}
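
Applied to the first pattern from the question, this could look like the sketch below. It is only an illustration: the `[......]` part is the portion elided in the question and is kept exactly as written there, so it has to be replaced with the real sub-pattern before the filter can match actual log lines.

```
filter {
  grok {
    # Single-quoted config string: the literal " characters inside the
    # regex no longer end the string, so they need no escaping.
    # [......] is the part elided in the question, kept as-is.
    match => { "message" => '%{NUMBER:ts} [......] (-|"%{USERNAME:token1}") (-|%{DATA:token2}) (-|"%{WORD:token3}") (-|"%{DATA:token4}")' }
  }
}
```

Because the quotes sit outside the `%{...}` captures, token1, token3 and token4 should come out without surrounding quotes.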

Hope it helps.

SebaGra

If you escape each " with a backslash, it works fine.

%{NUMBER:ts} [......] (-|"%{USERNAME:token1}") (-|%{DATA:token2}) (-|"%{WORD:token3}") (-|"%{DATA:token4}")

Your new string will look like:

%{NUMBER:ts} [......] (-|\"%{USERNAME:token1}\") (-|%{DATA:token2}) (-|\"%{WORD:token3}\") (-|\"%{DATA:token4}\")
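
Placed in a config file, the escaped form might look like the following sketch. It assumes the pattern stays inside a double-quoted config string and keeps the `[......]` part elided exactly as in the question. How backslashes in config strings are handled can depend on the Logstash version and on the `config.support_escapes` setting, so treat this as something to test rather than a guaranteed fix.

```
filter {
  grok {
    # Double-quoted config string: every literal " inside the regex
    # is written as \" so that it does not terminate the string.
    # [......] is the part elided in the question, kept as-is.
    match => { "message" => "%{NUMBER:ts} [......] (-|\"%{USERNAME:token1}\") (-|%{DATA:token2}) (-|\"%{WORD:token3}\") (-|\"%{DATA:token4}\")" }
  }
}
```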

NileshP

Try the mutate filter's gsub option after you have extracted the fields with their quotes:

filter {
  mutate {
    gsub => [
      "fieldname", "\"", ""
    ]
  }
}

https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-gsub
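
Combined with the QS-based pattern from the question, a minimal sketch could look like this. The field names token1, token3 and token4 come from the question, and `[......]` is the elided part kept as-is. The regex `^"|"$` is my own choice for illustration: it strips only a leading and a trailing quote, whereas replacing every `"` would also remove quotes inside the value.

```
filter {
  grok {
    # QS captures each quoted string together with its surrounding quotes
    match => { "message" => '%{NUMBER:ts} [......] (-|%{QS:token1}) (-|%{DATA:token2}) (-|%{QS:token3}) (-|%{QS:token4})' }
  }
  mutate {
    # gsub takes triples: field name, regex, replacement.
    # Each triple removes the quote at the start and at the end of the field.
    gsub => [
      "token1", '^"|"$', "",
      "token3", '^"|"$', "",
      "token4", '^"|"$', ""
    ]
  }
}
```

This costs an extra filter pass compared with keeping the quotes outside the captures, but it leaves the grok pattern free of literal quotes.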

geekscrap