So, the AWS CloudFront WAF logs get sent to AWS CloudWatch Logs Insights. How can I search the key/value pairs of the httpRequest headers array when their position in the array is random?

Example log looks like this:

httpRequest.headers.0.name  host
httpRequest.headers.0.value www.somedomain.com
httpRequest.headers.1.name  cache-control
httpRequest.headers.1.value no-cache
httpRequest.headers.2.name  pragma
httpRequest.headers.2.value no-cache
httpRequest.headers.3.name  accept
httpRequest.headers.3.value */*
httpRequest.headers.4.name  accept-encoding
httpRequest.headers.4.value gzip, deflate
httpRequest.headers.5.name  from
httpRequest.headers.5.value bingbot(at)microsoft.com
httpRequest.headers.6.name  user-agent
httpRequest.headers.6.value Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)

So, it's a JSON array of hashes, each with two keys ("name" and "value"). The order in that array is random: sometimes user-agent will be at index 1 or 3 or X. How can I search the "value" field of the entry whose "name" field is "user-agent"? I.e., I want to search for "bingbot" but have it match only against the user-agent header. I know I can just do a filter on @message for bingbot, but that seems expensive, not specific, and prone to false hits.

1 Answer

Okay, so I think the "easiest" way is to treat @message as a string and write your own parse rule: pull the value you want into its own field via a regex, and then you can search, filter, or aggregate on that field.

If anyone has a better idea I'm all ears.

fields @timestamp, @message
| parse @message /(?i)"name":"user-agent","value":"(?<httpRequestUserAgent>[^"]+)/
| filter action == "BLOCK"
| stats count() as httpRequestUserAgentCount by httpRequestUserAgent 
| sort by httpRequestUserAgentCount desc 

The (?i) at the start of the regex makes the match case-insensitive.
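
To get at the original question (matching "bingbot" only in the user-agent header), the same parse step can feed a filter on the extracted field instead of a stats aggregation. This is just a sketch: it reuses the httpRequestUserAgent field name from the query above, and the sort and limit at the end are my own choices, not anything required. It also assumes the value contains no escaped double quotes, since [^"]+ stops at the first quote it sees.

```
# Extract the user-agent header value, wherever it sits in the headers array
fields @timestamp, @message
| parse @message /(?i)"name":"user-agent","value":"(?<httpRequestUserAgent>[^"]+)/
# Match only against the extracted field, not the whole @message
| filter httpRequestUserAgent like /(?i)bingbot/
| sort @timestamp desc
| limit 20
```

Log events where the parse pattern doesn't match should end up with an empty httpRequestUserAgent and get dropped by the filter, so a "bingbot" that appears in some other header or in the URI shouldn't produce a hit.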

  • This is awesome. Worth noting that the `httpRequestUserAgent` string that comes back comes through with the casing preserved. Which is good! – Grey Vugrin Oct 05 '22 at 18:27