
The following is my Nginx log format:

log_format timed_combined '$http_x_forwarded_for - $remote_user [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent" ' '$request_time $upstream_response_time $pipe';

The following is an Nginx log entry (for reference):

- - test.user [26/May/2017:21:54:26 +0000] "POST /elasticsearch/_msearch HTTP/1.1" 200 263 "https://myserver.com/app/kibana" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 0.020 0.008 .

The following is the Logstash grok pattern:

NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:clientip} - - \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:request_time} %{NUMBER:upstream_time}

Error found in the Logstash log:

"status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"26/May/2017:19:28:14 -0400\" is malformed at \"/May/2017:19:28:14 -0400\"

Issue: Nginx logs are not getting grokked.
Requirement: The timestamp should be parsed into a dedicated field.

What's wrong with my configuration, and how do I fix this error?

Ashik Mohammed

2 Answers


Here is a Logstash filter for the NGINX access.log and error.log files.

filter {

############################# NGINX ##############################
  if [event][module] == "nginx" {

########## access.log ##########
    if [fileset][name] == "access" {
      grok {
        match => { "message" => ["%{IPORHOST:ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] \"%{WORD:http_method} %{DATA:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent_bytes} \"%{DATA:referrer}\" \"%{DATA:agent}\""] }
        remove_field => "message"
      }
      date {
        # Lowercase yyyy is the Joda calendar year; uppercase YYYY is
        # week-year and can misparse dates around the turn of the year.
        match => ["time", "dd/MMM/yyyy:HH:mm:ss Z"]
        target => "@timestamp"
        remove_field => "time"
      }
      useragent {
        source => "agent"
        target => "user_agent"
        remove_field => "agent"
      }
      geoip {
        source => "ip"
        target => "geoip"
      }
    }

########## error.log ##########
    else if [fileset][name] == "error" {
      grok {
        match => { "message" => ["%{DATA:time} \[%{DATA:log_level}\] %{NUMBER:pid}#%{NUMBER:tid}: (\*%{NUMBER:connection_id} )?%{GREEDYDATA:messageTmp}"] }
        remove_field => "message"
      }
      date {
        # Again, yyyy (calendar year), not YYYY (week-year).
        match => ["time", "yyyy/MM/dd HH:mm:ss"]
        target => "@timestamp"
        remove_field => "time"
      }

      mutate {
        rename => {"messageTmp" => "message"}
      }
    }

    mutate {
      # Drop the Filebeat metadata and tag the event with the service
      # name. mutate is the right tool here; a grok filter with no
      # match pattern is not a reliable way to remove fields.
      remove_field => "[event]"
      add_field => {"serviceName" => "nginx"}
    }
  }
}
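
To try this filter in isolation before wiring it up to Filebeat, it can be run against sample lines on stdin. The following is a minimal sketch (the hard-coded add_field entries are stand-ins for the metadata Filebeat's nginx module would normally attach):

input {
  # Read test log lines from the terminal and fake the Filebeat
  # module metadata so the conditionals in the filter above match.
  stdin {
    add_field => {
      "[event][module]" => "nginx"
      "[fileset][name]" => "access"
    }
  }
}

# ... the filter block from above goes here ...

output {
  # Print each parsed event so the extracted fields can be inspected.
  stdout { codec => rubydebug }
}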

There is also a similar filter for Tomcat: https://gist.github.com/petrov9/4740c61459a5dcedcef2f27c7c2900fd

Anton

The log line you provided does not match the default NGINXACCESS grok pattern because of two differences:

  1. The pattern expects an IP address or hostname as the first element, but in your log line the first element is a dash (-).
  2. The third element in your log line is a username, whereas the grok pattern expects a dash (-) there.

So there are two ways to resolve this:

  1. Make sure your log lines match the default pattern.
  2. Change the grok pattern to something like this:

NGINXACCESS - - %{USERNAME:username} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:request_time} %{NUMBER:upstream_time}

I suggest using the Grok Debugger to develop and debug grok patterns. It lets you build and test them incrementally.
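
Putting it together, the Logstash side could look like the following minimal sketch. The patterns_dir path is an assumption and must point at the directory containing the file with the custom NGINXACCESS pattern; the date filter then parses the captured timestamp into @timestamp, which covers the requirement of getting the timestamp into a dedicated field:

filter {
  grok {
    # Load the custom NGINXACCESS pattern from a patterns file
    # (the directory path below is an assumption).
    patterns_dir => ["/etc/logstash/patterns"]
    match => { "message" => "%{NGINXACCESS}" }
  }
  date {
    # Parse the captured timestamp into @timestamp. Lowercase yyyy is
    # deliberate: Joda's uppercase YYYY means week-year, a common
    # source of subtly wrong dates.
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
    target => "@timestamp"
  }
}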

breml
  • I have modified the grok patterns using the Grok Debugger, and the grok filter now matches the log. But I am still getting the same error: **caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"29/May/2017:22:53:37 -0400\" is malformed at \"/May/2017:22:53:37 -0400\"** @breml – Ashik Mohammed May 30 '17 at 03:01
  • I do not think this error is caused by the grok pattern but by another part of your Logstash configuration. Without further details it will not be possible to answer your question regarding the above-mentioned error. – breml May 30 '17 at 05:37
  • I tried the same configuration on a fresh test server and it works fine, so I think Elasticsearch reindexing will work in this case. These conflicts started occurring after I modified the Nginx log patterns (added a parameter for the original client IP and added response times). Do you think this might be the cause of the error? – Ashik Mohammed May 30 '17 at 07:14