
Issue: I have a log file to parse with 84 columns, of which 60 are optional. I got the pattern working, but if grok finds a single log line with a missing field it throws an error, and in my case 99% of the lines have some field missing. Is there a way to configure grok to ignore a field that has no value (or to insert a dummy or blank value) and move on to the next column?

There are 84 columns, of which 60 are optional. I am trying to use grok to parse the file and was only able to do it when all 84 columns are present.

ads 1.0 4572165a-c5b5-420b-851d-dc69d6d73673 20297cab-4b4c-4b55-b1a8-9ddc436a3f08 2014-02-24 23:55:14 953 1979 93215 106241 97170 58881 29926 10939 6852 34 36 3 URL.com/movie_player.php?pid=155&utm_source=ADK&utm_medium=CPC&utm_campaign=test4_pid155&utm_term=78434-2000241 8 3 1012 98.226.166.151 6042 5303 US IN 527 11 0 7075 7029 -6 11001 12008 1 11300 0 0 0 1 url.com/movie_player.php?pid=155&utm_source=adk&utm_medium=cpc&utm_campaign=test4_pid155&utm_term=78434-2000241 www.url.com url.com 11203 65792 0 live.test.com/swf/v4/manager.swf 345550 7.7 USD 0 0 0 0 0 0 25 0 0 60 0 0 0 0 0 0 1393286114 2 0
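One common way to make a field optional in a grok pattern is to wrap it in an optional non-capturing group, `(?: ... )?`, so the match succeeds whether or not the field is present. A minimal sketch (the field names here are illustrative, not the actual 84-column pattern, and this assumes whitespace-delimited data):

```
filter {
  grok {
    match => { "message" => "%{WORD:product} %{NUMBER:version}(?: %{UUID:session_id})?(?: %{IP:client_ip})?" }
  }
}
```

Note that with tab-separated data where an empty field still leaves a delimiter behind, the optional group has to sit between the delimiters (e.g. `\t(?:%{NUMBER:field})?`) rather than swallow them.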

Rachit

1 Answer


This is what I am doing to get around the issue:

Given: grok/logstash does not work well with TSV data (https://logstash.jira.com/browse/LOGSTASH-1550), but grok is fine with CSV.

Workaround: I wrote a Python script to convert the TSV to CSV before the filters, then run the result through the csv filter.
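The original script was not posted; a minimal sketch of the TSV-to-CSV conversion step might look like the following (it reads TSV on stdin and writes CSV on stdout, preserving empty fields and quoting any field that itself contains a comma):

```python
import csv
import sys

def tsv_to_csv(infile, outfile):
    # csv.reader with a tab delimiter keeps empty fields as empty strings,
    # so missing optional columns survive the conversion.
    reader = csv.reader(infile, delimiter="\t")
    writer = csv.writer(outfile)
    for row in reader:
        writer.writerow(row)

if __name__ == "__main__":
    tsv_to_csv(sys.stdin, sys.stdout)
```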

Sample output:

This is what the rubydebug output looks like (missing optional fields come through as nil):

                  "supply_sample" => "0",
                "diagnostic_code" => "60",
        "logging_diagnostic_code" => nil,
     "billable_cluster_pi_values" => nil,
    "effective_cluster_pi_values" => nil,

Edit: the Python script is not needed; this is what I am doing now:

find . -name "20140224-2310-10_126_94_215-21460.1.gz" | xargs zcat | sed 's/\t/,/g' | nc localhost 3333
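On the Logstash side, the comma-separated stream arriving on port 3333 can then be read with the tcp input and split with the csv filter. A sketch with illustrative column names (the real pipeline lists all 84):

```
input {
  tcp { port => 3333 }
}
filter {
  csv {
    # list all 84 column names in order; these few are illustrative
    columns => ["product", "version", "session_id", "timestamp"]
  }
}
```

Columns absent from a given line simply end up unset or nil instead of causing a parse failure, which is the behavior grok could not provide here.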
