
I have a Logstash integration with Kibana, and access logs are published to a Kibana dashboard.

I have some logs and some grok patterns to recognize them. The patterns define named fields. For some log lines the fields are extracted, but for others they are not. When I test locally with the grok debugger, the pattern looks fine. What could be the issue?

Log line for which the pattern doesn't match:

2015-07-31 04:02:40 0.001 377 GET /ics 302 - "1.00572FZnxXkFo2n_GlCCyf0005yG0008PD;kYjE0ZDLIPGDj9ROnG" - "10.242.5.120"

Pattern:

ICSACCESSTIMESTAMPSTRING %{DATE}    %{TIME}
ICSWLS_ACCESS_LOG_FM1 %{ICSACCESSTIMESTAMPSTRING:icswlsaccess-logtimestamp} %{NUMBER:icswlsaccess-timetaken:float}  %{NUMBER:icswlsaccess-bytes:int}    %{DATA:icswlsaccess-csmethod}   %{DATA:icswlsaccess-csurl}  %{NUMBER:icswlsaccess-cstatus:int}  "%{DATA:icswlsaccess-dmsecid}" "%{DATA:icswlsaccess-ecidcontext}"   %{DATA:icswlsaccess-proxyremoteuser}    %{GREEDYDATA:icswlsaccess-proxyclientip}

ICSWLS_ACCESS_LOG_FM2 %{ICSACCESSTIMESTAMPSTRING:icswlsaccess-logtimestamp}  %{NUMBER:icswlsaccess-timetaken:float}   %{NUMBER:icswlsaccess-bytes:int} %{DATA:icswlsaccess-csmethod}    %{DATA:icswlsaccess-csurl}       %{NUMBER:icswlsaccess-cstatus:int}       "%{DATA:icswlsaccess-dmsecid}"   %{DATA:icswlsaccess-ecidcontext}       %{DATA:icswlsaccess-proxyremoteuser}     %{GREEDYDATA:icswlsaccess-proxyclientip}

ICSWLS_ACCESS_LOG_FM3 %{ICSACCESSTIMESTAMPSTRING:icswlsaccess-logtimestamp} %{NUMBER:icswlsaccess-timetaken:float}  %{NUMBER:icswlsaccess-bytes:int}    %{DATA:icswlsaccess-csmethod}   %{DATA:icswlsaccess-csurl}  %{NUMBER:icswlsaccess-cstatus:int}  "%{DATA:icswlsaccess-dmsecid}"  "%{DATA:icswlsaccess-ecidcontext}"  %{DATA:icswlsaccess-proxyremoteuser}    %{GREEDYDATA:icswlsaccess-proxyclientip}

ICSWLS_ACCESS_LOG_FM4 %{ICSACCESSTIMESTAMPSTRING:icswlsaccess-logtimestamp} %{NUMBER:icswlsaccess-timetaken:float}  %{NUMBER:icswlsaccess-bytes:int}    %{DATA:icswlsaccess-csmethod}   %{DATA:icswlsaccess-csurl}  %{NUMBER:icswlsaccess-cstatus:int}  "%{DATA:icswlsaccess-dmsecid}"  %{DATA:icswlsaccess-ecidcontext}    %{DATA:icswlsaccess-proxyremoteuser}    %{GREEDYDATA:icswlsaccess-proxyclientip}

ICSWLS_ACCESS_LOG_FM5 #%{GREEDYDATA:logcomments}

ICSWLS_ACCESS_LOG %{ICSWLS_ACCESS_LOG_FM1}|%{ICSWLS_ACCESS_LOG_FM2}|%{ICSWLS_ACCESS_LOG_FM3}|%{ICSWLS_ACCESS_LOG_FM4}|%{ICSWLS_ACCESS_LOG_FM5}

Here is one more example that I tried. Sample message:

2015-08-12 13:20:48 0.002 377 GET /ics 302 - "1.0057HoLhIMPFo2n_GlCCyf0003TL000GHW;kYjE0ZDLIPGDj9ROnG" - "10.242.5.120"

Pattern:

ICSACCESSTIMESTAMPSTRING2 %{DATE} *%{TIME}

ICSWLS_ACCESS_LOG_FM6 %{ICSACCESSTIMESTAMPSTRING2:icswlsaccess-logtimestamp} *%{NUMBER:icswlsaccess-timetaken:float} *%{NUMBER:icswlsaccess-bytes:int} *%{DATA:icswlsaccess-csmethod} *%{DATA:icswlsaccess-csurl} *%{NUMBER:icswlsaccess-cstatus:int} *"%{DATA:icswlsaccess-dmsecid}" *"%{DATA:icswlsaccess-ecidcontext}" *%{DATA:icswlsaccess-proxyremoteuser} *%{GREEDYDATA:icswlsaccess-proxyclientip}
Jinu Mohan

2 Answers

In all of your patterns, you've defined two spaces between timetaken and bytes; your input line only has one.

You might consider using " *", which matches any number of spaces (including none), or " +", which requires at least one. Either would seem to simplify all of your patterns.

You can also use '"*' to make the quotes optional, which I think would combine everything into one pattern.
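To illustrate the spacing point outside of Logstash, here is a minimal Python sketch. The two regexes are hypothetical, simplified stand-ins for the timestamp, timetaken and bytes parts of the grok pattern, not the real grok expansions; the only difference between them is two literal spaces versus " +".

```python
import re

# The log line from the question (single spaces between fields).
line = ('2015-07-31 04:02:40 0.001 377 GET /ics 302 - '
        '"1.00572FZnxXkFo2n_GlCCyf0005yG0008PD;kYjE0ZDLIPGDj9ROnG" - "10.242.5.120"')

# Simplified stand-in for a number such as 0.001 (an assumption, not grok's NUMBER).
NUM = r"\d+(?:\.\d+)?"

# Two literal spaces between timetaken and bytes, as in the original patterns.
strict = re.compile(rf"^\S+ \S+ (?P<timetaken>{NUM})  (?P<bytes>\d+) ")

# " +" accepts one or more spaces between the same two fields.
tolerant = re.compile(rf"^\S+ +\S+ +(?P<timetaken>{NUM}) +(?P<bytes>\d+) +")

strict_match = strict.match(line)
tolerant_match = tolerant.match(line)

print(strict_match)                        # None: the line has only one space there
print(tolerant_match.group("timetaken"))   # 0.001
print(tolerant_match.group("bytes"))       # 377
```

The same idea carries over to grok, since a grok pattern is a regular expression once its %{...} references are expanded.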

Finally, imagine what your regexp looks like after you've OR'ed together 5 complicated patterns. It can't be very efficient to run that across each line that comes in. Fortunately, you don't have to do that anymore: grok's match option accepts an array of patterns for a field and tries them in order.

Alain Collins
  • I tried a pattern that matches exactly, but it didn't work. I also tried " +" and " *", but neither worked. – Jinu Mohan Aug 10 '15 at 11:01

I use %{SPACE} to handle whitespace in grok patterns, and it works very well. For the example in question, below is a possible solution using %{SPACE}:

filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:icswlsaccess-logtimestamp}%{SPACE}%{NUMBER:icswlsaccess-timetaken:float}%{SPACE}%{NUMBER:icswlsaccess-bytes:int}%{SPACE}%{WORD:icswlsaccess-csmethod}%{SPACE}%{DATA:icswlsaccess-csurl}%{SPACE}%{NUMBER:icswlsaccess-cstatus:int}%{SPACE}%{DATA:icswlsaccess-dmsecid}%{SPACE}\"%{DATA:icswlsaccess-ecidcontext}\"%{SPACE}%{DATA:icswlsaccess-proxyremoteuser}%{SPACE}\"%{GREEDYDATA:icswlsaccess-proxyclientip}\"" }
  }
  date {
    match => ["icswlsaccess-logtimestamp", "yyyy-MM-dd HH:mm:ss"]
    timezone => "Europe/London"  # <----- use timezone that is applicable to you
    target => "@timestamp"
    remove_field => ["icswlsaccess-logtimestamp"]
  }
}

Sample output would look like this:

{
        "icswlsaccess-ecidcontext" => "1.0057HoLhIMPFo2n_GlCCyf0003TL000GHW;kYjE0ZDLIPGDj9ROnG",
           "icswlsaccess-csmethod" => "GET",
            "icswlsaccess-cstatus" => 302,
    "icswlsaccess-proxyremoteuser" => "-",
              "icswlsaccess-bytes" => 377,
              "icswlsaccess-csurl" => "/ics",
                      "@timestamp" => 2015-08-12T12:20:48.000Z,
          "icswlsaccess-timetaken" => 0.002,
            "icswlsaccess-dmsecid" => "-",
      "icswlsaccess-proxyclientip" => "10.242.5.120"
                         
}
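Note that @timestamp in the output reads 12:20:48 while the log line says 13:20:48. That is the date filter converting the Europe/London local time to UTC. A small Python sketch of the same arithmetic, assuming London was on British Summer Time (UTC+1) on that date and using a fixed offset as a stand-in for the named zone:

```python
from datetime import datetime, timedelta, timezone

# Assumption: Europe/London is at UTC+1 (BST) on 2015-08-12.
# A fixed offset is used here instead of a named time zone to keep the sketch stdlib-only.
BST = timezone(timedelta(hours=1))

# Same layout as the date filter's "yyyy-MM-dd HH:mm:ss".
local = datetime.strptime("2015-08-12 13:20:48", "%Y-%m-%d %H:%M:%S").replace(tzinfo=BST)

# Convert the local wall-clock time to UTC, which is what @timestamp stores.
utc = local.astimezone(timezone.utc)
utc_text = utc.strftime("%Y-%m-%dT%H:%M:%S.000Z")

print(utc_text)  # 2015-08-12T12:20:48.000Z
```

If your servers log in a different zone, the offset (and so the shift you see in @timestamp) will differ accordingly.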
karan shah