I'm trying to grab the top-level directory of each GET request's path and count them in Splunk using this capturing regex:
index=main sourcetype="access_combined_wcookie" | rex "(?i)\"GET /(?P<MYDIR>\w+)/" | timechart count by MYDIR
This sort of works. It grabs the names of the top-level directories and counts them over time as expected, except that it also displays HEAD requests as "NULL" or "OTHER".
The regex works as expected in both Perl and Python (i.e., it doesn't match a HEAD request). Does anyone have an idea what I have to do to make Splunk stop reporting events that the regex didn't capture to begin with? This behavior is really counter-intuitive.
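
For reference, here is a minimal Python sketch of the kind of check described above. The pattern is the one from the rex command, without the backslash that escapes the quote inside the Splunk string. The three log lines are made up for illustration (they are not from the real index), and `extract_dir` is just a hypothetical helper name.

```python
import re

# Same pattern as in the rex command; the \" escape is only needed inside Splunk's quoted string.
PATTERN = re.compile(r'(?i)"GET /(?P<MYDIR>\w+)/')

# Hypothetical access_combined-style lines, invented for this example.
SAMPLE_LINES = [
    '10.0.0.1 - - [01/Jan/2024:00:00:01 +0000] "GET /cart/view.do HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2024:00:00:02 +0000] "HEAD /cart/view.do HTTP/1.1" 200 0',
    '10.0.0.3 - - [01/Jan/2024:00:00:03 +0000] "GET /product/item.do HTTP/1.1" 200 1024',
]

def extract_dir(line):
    """Return the captured top-level directory, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.group("MYDIR") if match else None

results = [extract_dir(line) for line in SAMPLE_LINES]
print(results)  # ['cart', None, 'product']
```

The HEAD line produces no capture here, which is the non-match mentioned above. This only exercises the regex itself; it does not model what Splunk does with events that end up without a MYDIR value.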