I'm trying to nut out every _grokparsefailure on my Logstash box.
The only two culprits seem to be NGINX log lines that trip up my NGINXACCESS pattern:
%{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer}) %{QS:agent}
The following two messages are examples that get tagged with _grokparsefailure:
172.31.0.2 - - [30/Jul/2015:15:10:49 +1000] "GET /web-app/[EXPAND] HTTP/1.1" 404 6432 "-" "Amazon CloudFront" "web-app.mydomain.com" "127.0.0.1"
172.31.0.2 - - [30/Jul/2015:14:13:52 +1000] "GET /web-app/show?wid=5540cfbc3asdf034ct=&domain=apptest.mydomain.com&ttl=\x5C%2230\x5C%22&filter_id=14026&unique_id=1 HTTP/1.1" 200 11400 "http://apptest.mydomain.com/" "Amazon CloudFront" "apptest.mydomain.com" "127.0.0.1"
Going through the grok debugger, the failure comes from %{URIPATHPARAM:request} hitting the brackets in [EXPAND] in the first example and the backslashes in \x5C%2230\x5C%22 in the second, i.e. if I remove [, ], or \ from the inputs then grok matches fine.
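To make that concrete, here is a rough Python sketch of what I think is happening. The URIPATH and URIPARAM character classes below are my transcription of the stock grok definitions, so treat them as an assumption (they may differ between Logstash versions). REQUEST and request_matches are just hypothetical helper names for this check, not anything from Logstash.

```python
import re

# Assumed approximations of the stock grok patterns (may vary by version).
# URIPATH's character class has no '[' or ']', and neither class has '\'.
URIPATH = r"(?:/[A-Za-z0-9$.+!*'(){},~:;=@#%_\-]*)+"
URIPARAM = r"\?[A-Za-z0-9$.+!*'|(){},~@#%&/=:;_?\-\[\]]*"
URIPATHPARAM = URIPATH + "(?:" + URIPARAM + ")?"

# Mirrors the '"%{WORD:verb} %{URIPATHPARAM:request} HTTP/' part of NGINXACCESS:
# the request has to be followed directly by ' HTTP/'.
REQUEST = re.compile(r'"\w+ (' + URIPATHPARAM + r') HTTP/')


def request_matches(fragment):
    """Return True if the quoted request part of a log line matches."""
    return REQUEST.search(fragment) is not None


# Request paths taken from the two failing log lines above.
bracket_path = r'/web-app/[EXPAND]'
backslash_path = (r'/web-app/show?wid=5540cfbc3asdf034ct=&domain=apptest.mydomain.com'
                  r'&ttl=\x5C%2230\x5C%22&filter_id=14026&unique_id=1')

candidates = [
    bracket_path,
    bracket_path.replace('[', '').replace(']', ''),  # brackets removed
    backslash_path,
    backslash_path.replace('\\', ''),                # backslashes removed
]

for path in candidates:
    fragment = '"GET ' + path + ' HTTP/1.1"'
    print(request_matches(fragment), path)
```

Under these assumed definitions the two original requests fail and the stripped variants match, which lines up with what the grok debugger shows.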
I can't seem to work out how to get the URIPATHPARAM grok pattern to deal with brackets and backslashes like the ones in those examples. Any ideas?