1

I'm putting nginx logs into logstash and the api information is sent via get unfortunately.

So there's 2 parts in logstash where API creditianals are stored. Here are examples

message: 10.120.40.105 - - [29/Jul/2015:16:41:09 +0000] "PUT /v1/resources/scenes/455IrIBcRsa0kkIs6mv9lQ?api_key=11111111111111111&api_secret=2222222222222222222222222 HTTP/1.1" 200 689 "-" "python-requests/2.6.0 CPython/2.7.9 Linux/2.6.32-504.30.3.el6.x86_64" "10.120.40.105" 0.180 0.180
request: /v1/resources/scenes/455IrIBcRsa0kkIs6mv9lQ?api_key=11111111111111111&api_secret=2222222222222222222222222

I'm dropping the request via

NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer}) %{QS:agent} %{QS:xforwardedfor} %{NUMBER:request_time} %{NUMBER:upstream_time}
NGINXACCESS %{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer}) %{QS:agent} %{QS:xforwardedfor} %{NUMBER:request_time}

my inputs look like

    grok {
        match => { "message" => "%{NGINXACCESS}" }
        patterns_dir => ["/etc/logstash/patterns"]
    }
    date {
        match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
    }
    geoip {
        source => "clientip"
        target => "geoip"
        database => "/usr/share/GeoIP/GeoLiteCity.dat"
        add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
        add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
    }
    mutate {
        convert => [ "[geoip][coordinates]", "float"]
        convert => [ "request_time", "float"]
        convert => [ "upstream_time", "float"]
    }

Is there any mutate way to replace the anything after api_secret= with like "xxxxxxxxxxxx"

thanks!

Mike
  • 22,310
  • 7
  • 56
  • 79

1 Answers1

2

This is actually a little harder than it looks, since the gsub field for mutate doesn't actually do what you want. It seems to not be quite as smart as you'd think.

I had to modify the patterns you're using, to capture everything before and after the request(pre_req and post_req respectively) but it does seem possible.
No idea how well it will scale performance-wise since there's a LOT of filtering going on here, but it does work.

I tested it with this config:

input {
  stdin {}
}

filter {
  grok {
    match => [
      "message" , "(?<pre_req>%{IPORHOST:clientip} (?<ident>[a-zA-Z\.\@\-\+_%]+) (?<auth>[a-zA-Z\.\@\-\+_%]+) \[%{HTTPDATE:timestamp}\] \"%{WORD:verb} )%{URIPATHPARAM:request}(?<post_req> HTTP/%{NUMBER:httpversion}\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:\"(?:%{URI:referrer}|-)\"|%{QS:referrer}) %{QS:agent} %{QS:xforwardedfor} %{NUMBER:request_time} %{NUMBER:upstream_time})",
      "message" , "(?<pre_req>%{IPORHOST:clientip} (?<ident>[a-zA-Z\.\@\-\+_%]+) (?<auth>[a-zA-Z\.\@\-\+_%]+) \[%{HTTPDATE:timestamp}\] \"%{WORD:verb} )%{URIPATHPARAM:request}(?<post_req> HTTP/%{NUMBER:httpversion}\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) (?:\"(?:%{URI:referrer}|-)\"|%{QS:referrer}) %{QS:agent} %{QS:xforwardedfor} %{NUMBER:request_time})"
      ]
    break_on_match => true
  }
  grok {
    match => { "request" => "(?<request_path>[^?]*)?(?<request_params>.*)"
  }

  }
  mutate {
    gsub => [ "request_params" , "[?]", "" ]
  }
  kv {
    field_split => "&"
    source => "request_params"
    prefix => "request_params_"
  }
  mutate {
    replace => { "request" => "%{request_path}?api_key=%{request_params_api_key}&api_secret=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX" }
    replace => { "message" => "%{pre_req}%{request}%{post_req}" }
    remove_field => [ "request_path", "request_params", "request_params_api_key", "request_params_api_secret", "pre_req", "post_req" ]
  }
}

output {
  stdout { codec => rubydebug }
}

And it seems to have done exactly what you want..

# /opt/logstash/bin/logstash -f config.conf
Logstash startup completed
10.120.40.105 - - [29/Jul/2015:16:41:09 +0000] "PUT /v1/resources/scenes/455IrIBcRsa0kkIs6mv9lQ?api_key=11111111111111111&api_secret=2222222222222222222222222 HTTP/1.1" 200 689 "-" "python-requests/2.6.0 CPython/2.7.9 Linux/2.6.32-504.30.3.el6.x86_64" "10.120.40.105" 0.180 0.180
{
          "message" => "10.120.40.105 - - [29/Jul/2015:16:41:09 +0000] \"PUT /v1/resources/scenes/455IrIBcRsa0kkIs6mv9lQ?api_key=11111111111111111&api_secret=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX HTTP/1.1\" 200 689 \"-\" \"python-requests/2.6.0 CPython/2.7.9 Linux/2.6.32-504.30.3.el6.x86_64\" \"10.120.40.105\" 0.180 0.180",
         "@version" => "1",
       "@timestamp" => "2015-07-29T19:21:14.678Z",
             "host" => "elk.example.com",
         "clientip" => "10.120.40.105",
            "ident" => "-",
             "auth" => "-",
        "timestamp" => "29/Jul/2015:16:41:09 +0000",
             "verb" => "PUT",
          "request" => "/v1/resources/scenes/455IrIBcRsa0kkIs6mv9lQ?api_key=11111111111111111&api_secret=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "httpversion" => "1.1",
         "response" => "200",
            "bytes" => "689",
            "agent" => "\"python-requests/2.6.0 CPython/2.7.9 Linux/2.6.32-504.30.3.el6.x86_64\"",
    "xforwardedfor" => "\"10.120.40.105\"",
     "request_time" => "0.180",
    "upstream_time" => "0.180"
}
GregL
  • 9,370
  • 2
  • 25
  • 36