I am trying to parse our log files and send them to Elasticsearch. The problem is that our S3 client injects text into the file that is separated by carriage returns (\r) instead of newline characters (\n). The file input plugin is configured with '\n' as the delimiter, which is consistent with 99% of the data. When I run Logstash against this data, it misses the last line, which is the one I am really looking for. This is because the file input treats the '\r' characters as normal text rather than as line breaks. To get around this, I am trying to use a mutate filter to rewrite the '\r' characters to '\n'. The mutate works, but grok still sees the result as one big line and tags the event with _grokparsefailure.
My 'normal' log lines are parsed by grok as expected.
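To illustrate what I think is happening, here is a minimal Python sketch. It is not Logstash code: the regex is a simplified, hypothetical stand-in for my grok pattern, the helper names are made up for illustration, and the sample string is an abbreviated version of the problem line shown in the Input section. It only shows that replacing '\r' with '\n' inside a single string still leaves one string, so a pattern anchored at the start cannot reach the real log line unless the text is also split into separate lines.

```python
import re

# Simplified, hypothetical stand-in for the grok pattern:
# date, time, log level, then the rest of the line.
LINE_RE = re.compile(
    r"\A(\d{2}/\d{2}/\d{2}) (\d{2}:\d{2}:\d{2}) ([A-Z]+) (.*)", re.DOTALL
)


def rewrite_cr(message):
    """Mimic the mutate/gsub step: replace carriage returns with newlines."""
    return message.replace("\r", "\n")


def parse(message):
    """Return (date, time, loglevel) if the message starts with a log header, else None."""
    m = LINE_RE.match(message)
    return m.groups()[:3] if m else None


# Abbreviated problem line: progress bars joined by '\r', then the real log line.
event = (
    "\r[Stage 79:=======> (30 + 8) / 208]"
    "\r[Stage 93:====>(206 + 2) / 208]"
    "\r \r15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor completed"
)

rewritten = rewrite_cr(event)

# Still a single event: the anchored pattern is tried against the whole string.
whole = parse(rewritten)

# Splitting the rewritten text into separate lines lets each one be tried on its own.
per_line = [parse(line) for line in rewritten.split("\n")]
```

In this sketch `whole` is None, while the last entry of `per_line` carries the date, time, and log level. That matches what I see in Logstash: the gsub changes the characters inside the event but does not turn it into several events.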
Config
input {
  file {
    path => "/home/pa_stg/runs/2015-12-09-cron-1449666001/run.log"
    start_position => "beginning"
    sincedb_path => "/data/logstash/sincedb"
    stat_interval => 300
    type => "spark"
  }
}
filter {
  mutate {
    gsub => ["message", "\r", "
"]
  }
  grok {
    match => {"message" => "\A%{DATE:date} %{TIME:time} %{LOGLEVEL:loglevel} %{SYSLOGPROG}%{GREEDYDATA:data}"}
    break_on_match => false
  }
}
output {
  stdout { codec => rubydebug }
}
Input
This sample from the input file illustrates the problem. The ^M characters are how vim displays the '\r' carriage returns ('more' hides most of them). I left the line as is so you can see that both Linux and the file input plugin treat the whole thing as a single line of text.
^M[Stage 79:=======> (30 + 8) / 208]^M[Stage 79:============> (49 + 8) / 208]^M[Stage 79:=================> (65 + 8) / 208]^M[Stage 79:=====================> (83 + 8) / 208]^M[Stage 79:===========================> (105 + 8) / 208]^M[Stage 79:===============================> (122 + 8) / 208]^M[Stage 79:====================================> (142 + 8) / 208]^M[Stage 79:=========================================> (161 + 8) / 208]^M[Stage 79:==============================================> (180 + 6) / 208]^M[Stage 79:==================================================> (195 + 3) / 208]^M[Stage 79:=====================================================>(206 + 1) / 208]^M ^M^M[Stage 86:==============> (55 + 8) / 208]^M[Stage 86:===================> (75 + 8) / 208]^M[Stage 86:==========================> (101 + 8) / 208]^M[Stage 86:===============================> (123 + 8) / 208]^M[Stage 86:======================================> (147 + 8) / 208]^M[Stage 86:============================================> (173 + 6) / 208]^M[Stage 86:==================================================> (193 + 3) / 208]^M[Stage 86:=====================================================>(205 + 1) / 208]^M ^M^M[Stage 93:===================> (74 + 8) / 208]^M[Stage 93:===========================> (104 + 8) / 208]^M[Stage 93:==================================> (132 + 8) / 208]^M[Stage 93:========================================> (157 + 9) / 208]^M[Stage 93:================================================> (186 + 6) / 208]^M[Stage 93:=====================================================>(206 + 2) / 208]^M ^M15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor completed
15/12/09 13:04:44 INFO CassandraConnector: Disconnected from Cassandra cluster: int
Output
{
"message" => "\n[Stage 79:=======> (30 + 8) / 208]\n[Stage 79:============>
(49 + 8) / 208]\n[Stage 79:=================> (65 + 8) / 208]\n[Stage 79:===
==================> (83 + 8) / 208]\n[Stage 79:===========================> (105 + 8
) / 208]\n[Stage 79:===============================> (122 + 8) / 208]\n[Stage 79:====================================>
(142 + 8) / 208]\n[Stage 79:=========================================> (161 + 8) / 208]\n[Stage 79:================
==============================> (180 + 6) / 208]\n[Stage 79:==================================================> (195 + 3) / 208]\n[St
age 79:=====================================================>(206 + 1) / 208]\n
\n\n[Stage 86:==============> (55 + 8) / 208]\n[Stage 86:===================>
(75 + 8) / 208]\n[Stage 86:==========================> (101 + 8) / 208]\n[Stage 86:====
===========================> (123 + 8) / 208]\n[Stage 86:======================================> (147 + 8)
/ 208]\n[Stage 86:============================================> (173 + 6) / 208]\n[Stage 86:========================================
==========> (193 + 3) / 208]\n[Stage 86:=====================================================>(205 + 1) / 208]\n
\n\n[Stage 93:===================> (74 + 8) / 208]\n[S
tage 93:===========================> (104 + 8) / 208]\n[Stage 93:==================================>
(132 + 8) / 208]\n[Stage 93:========================================> (157 + 9) / 208]\n[Stage 93:============================
====================> (186 + 6) / 208]\n[Stage 93:=====================================================>(206 + 2) / 208]\n
\n15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor com
pleted",
"@version" => "1",
"@timestamp" => "2015-12-09T22:16:52.898Z",
"host" => "ip-10-252-1-225",
"path" => "/home/something/pa_stg/runs/2015-12-09-cron-1449666001/run.log",
"type" => "spark",
"tags" => [
[0] "_grokparsefailure"
]
}
I need grok to parse the following line as if the '\r' in front of it were a newline '\n'. Does anyone know how to fix this?
15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor completed