I have the following filter, which covers most of my needs:
filter {
  grok {
    match => { "message" => [ "%{IPORHOST:clientip} - %{NGUSER:user} \[%{HTTPDATE:timestamp}\] (?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest}) %{NUMBER:response} (?:%{NUMBER:bytes}|-) (-|(%{DATA:referrer}))" ] }
  }
}
However, some (not all) of the logs I am parsing contain the name of the channel a user is watching on my Apache server.
A typical log line that includes the word "channel" looks like this:
10.40.80.11 - alex@example.com [03/Jan/2014:13:08:21 +0000] "GET /cgi-bin/feed/epg?channel=Bloomberg%20English&date=2016-01-03 HTTP/1.1" 200 368 "http://example.net/cgi-bin/feed/epg" "Mozilla/5.0"
The request is saved in a separate field, "rawrequest", like this:
"GET /cgi-bin/feed/epg?channel=Bloomberg%20English&date=2016-04-04 HTTP/1.1"
Question: How can I save the channel name in a separate field, given that not all logs contain the word "channel" in the field "rawrequest"?
I have seen lots of examples, but nothing similar. The character separating the channel from the rest of the string is "&". I would appreciate any help.
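Solution:
Add a second grok match that captures everything after "channel=" up to the next "&" into a new field, "Channels". Logs without "channel=" simply do not get that field. The field name on the left must be the one that actually holds the request string ("request" here, or "rawrequest" if the line was parsed by the fallback branch of the first pattern).
match => { "request" => [ "channel=(?<Channels>[^&]+)" ] }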
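
For context, here is a minimal sketch of how that match could sit in a complete filter section. It is an illustration under a few assumptions, not a tested drop-in: it matches against "message" rather than "request" so that it does not depend on which branch of the first pattern fired, it stops the capture at whitespace as well as "&" in case channel is the last query parameter, it sets tag_on_failure to an empty list so that lines without a channel are not tagged _grokparsefailure, and it uses the urldecode filter to turn "Bloomberg%20English" into "Bloomberg English".

```
filter {
  # The first grok block (the access-log pattern from the question) goes here.

  # Second, independent grok: only extracts the channel when it is present.
  grok {
    # Assumption: match the raw line, since "message" always exists.
    # The capture stops at "&" or at whitespace (end of the URL).
    match => { "message" => "channel=(?<Channels>[^&\s]+)" }
    # Lines without "channel=" are expected, so do not tag them as failures.
    tag_on_failure => []
  }

  # Optional: decode %20 and similar escapes, only when the field was captured.
  if [Channels] {
    urldecode {
      field => "Channels"
    }
  }
}
```

With the sample line above, this should leave "Channels" set to the channel name, while lines without a channel pass through unchanged.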