
I am trying to parse our log files and send them to Elasticsearch. The problem is that our S3 client injects lines into the file that contain carriage returns (\r) instead of newline characters (\n). The file input is configured with '\n' as the delimiter, which is consistent with 99% of the data. When I run Logstash against this data, it misses the last line, which is the one I am really looking for. This is because the file input treats the '\r' characters as normal text rather than as line breaks. To get around this I am trying to use a mutate filter to rewrite the '\r' characters to '\n'. The mutate works, but grok still sees the result as one big line and tags the event with _grokparsefailure.

My 'normal' log file lines Grok as expected.

Config

input {
     file {
             path => "/home/pa_stg/runs/2015-12-09-cron-1449666001/run.log"
             start_position => "beginning"
             sincedb_path => "/data/logstash/sincedb"
             stat_interval => 300
             type => "spark"
     }
}
filter{
     mutate {
             gsub => ["message", "\r", "
"]
     }
     grok {
             match => {"message" => "\A%{DATE:date} %{TIME:time} %{LOGLEVEL:loglevel} %{SYSLOGPROG}%{GREEDYDATA:data}"}
             break_on_match => false
     }
}
output{
     stdout { codec => rubydebug }
}

Input

This sample from the input file illustrates the problem. The ^M characters are how vim displays the '\r' carriage returns ('more' hides most of them). I left the line as is so you can see that both Linux and the file input treat the whole thing as a single line of text.

^M[Stage 79:=======>                                               (30 + 8) / 208]^M[Stage 79:============>                                          (49 + 8) / 208]^M[Stage 79:=================>                                     (65 + 8) / 208]^M[Stage 79:=====================>                                 (83 + 8) / 208]^M[Stage 79:===========================>                          (105 + 8) / 208]^M[Stage 79:===============================>                      (122 + 8) / 208]^M[Stage 79:====================================>                 (142 + 8) / 208]^M[Stage 79:=========================================>            (161 + 8) / 208]^M[Stage 79:==============================================>       (180 + 6) / 208]^M[Stage 79:==================================================>   (195 + 3) / 208]^M[Stage 79:=====================================================>(206 + 1) / 208]^M                                                                                ^M^M[Stage 86:==============>                                        (55 + 8) / 208]^M[Stage 86:===================>                                   (75 + 8) / 208]^M[Stage 86:==========================>                           (101 + 8) / 208]^M[Stage 86:===============================>                      (123 + 8) / 208]^M[Stage 86:======================================>               (147 + 8) / 208]^M[Stage 86:============================================>         (173 + 6) / 208]^M[Stage 86:==================================================>   (193 + 3) / 208]^M[Stage 86:=====================================================>(205 + 1) / 208]^M                                                                                ^M^M[Stage 93:===================>                                   (74 + 8) / 208]^M[Stage 93:===========================>                          (104 + 8) / 208]^M[Stage 93:==================================>                   (132 + 8) / 208]^M[Stage 
93:========================================>             (157 + 9) / 208]^M[Stage 93:================================================>     (186 + 6) / 208]^M[Stage 93:=====================================================>(206 + 2) / 208]^M                                                                                ^M15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor completed
15/12/09 13:04:44 INFO CassandraConnector: Disconnected from Cassandra cluster: int

Output

{
       "message" => "\n[Stage 79:=======>                                               (30 + 8) / 208]\n[Stage 79:============>
                             (49 + 8) / 208]\n[Stage 79:=================>                                     (65 + 8) / 208]\n[Stage 79:===
==================>                                 (83 + 8) / 208]\n[Stage 79:===========================>                          (105 + 8
) / 208]\n[Stage 79:===============================>                      (122 + 8) / 208]\n[Stage 79:====================================>
               (142 + 8) / 208]\n[Stage 79:=========================================>            (161 + 8) / 208]\n[Stage 79:================
==============================>       (180 + 6) / 208]\n[Stage 79:==================================================>   (195 + 3) / 208]\n[St
age 79:=====================================================>(206 + 1) / 208]\n
                  \n\n[Stage 86:==============>                                        (55 + 8) / 208]\n[Stage 86:===================>
                            (75 + 8) / 208]\n[Stage 86:==========================>                           (101 + 8) / 208]\n[Stage 86:====
===========================>                      (123 + 8) / 208]\n[Stage 86:======================================>               (147 + 8)
 / 208]\n[Stage 86:============================================>         (173 + 6) / 208]\n[Stage 86:========================================
==========>   (193 + 3) / 208]\n[Stage 86:=====================================================>(205 + 1) / 208]\n
                                                     \n\n[Stage 93:===================>                                   (74 + 8) / 208]\n[S
tage 93:===========================>                          (104 + 8) / 208]\n[Stage 93:==================================>
   (132 + 8) / 208]\n[Stage 93:========================================>             (157 + 9) / 208]\n[Stage 93:============================
====================>     (186 + 6) / 208]\n[Stage 93:=====================================================>(206 + 2) / 208]\n
                                                                 \n15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor com
pleted",
      "@version" => "1",
    "@timestamp" => "2015-12-09T22:16:52.898Z",
          "host" => "ip-10-252-1-225",
          "path" => "/home/something/pa_stg/runs/2015-12-09-cron-1449666001/run.log",
          "type" => "spark",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}

I need grok to parse the line below as if each '\r' were a newline '\n'. Does anyone know how to fix this?

15/12/09 13:03:46 INFO SomethingProcessor$: Something Processor completed
Jeremiah Adams
  • Your input doesn't match your output. Input has "142 + 8" and output starts with "49 + 8". ? – Alain Collins Dec 09 '15 at 22:54
  • Thanks @AlainCollins. Good eyes. It seems that the linux 'more' command wasn't displaying the entire line. I assume this is due to the carriage returns. I've updated the original question with output from vim which has the entire line in question. – Jeremiah Adams Dec 10 '15 at 15:11
  • If it's coming in as one line (and you can't change whatever's writing the file), then the 'split' filter might work. It creates multiple events from one document. – Alain Collins Dec 10 '15 at 18:55
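
The split-filter suggestion in the comment above could be sketched roughly as follows. This is an untested sketch, not a confirmed fix: it assumes the split filter is available in your Logstash install, relies on its default of splitting the "message" field on newlines, and reuses the mutate and grok settings from the question unchanged. The conditional drop at the end is an optional extra that discards the progress-bar fragments, which will not match the grok pattern.

```
filter {
     mutate {
             # Same rewrite as in the question: the replacement string is a literal line break
             gsub => ["message", "\r", "
"]
     }
     # Turn the single multi-line event into one event per line
     split {
             field => "message"
     }
     grok {
             match => {"message" => "\A%{DATE:date} %{TIME:time} %{LOGLEVEL:loglevel} %{SYSLOGPROG}%{GREEDYDATA:data}"}
             break_on_match => false
     }
     # Optional: discard fragments (such as the "[Stage ...]" progress bars) that grok could not parse
     if "_grokparsefailure" in [tags] {
             drop { }
     }
}
```

With this ordering, grok runs once per split event, so the `\A` anchor in the pattern lines up with the start of the real log line instead of the start of the progress-bar text.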

1 Answer

I believe what you are looking for might be the multiline filter.

https://www.elastic.co/guide/en/logstash/current/plugins-filters-multiline.html

If I recall correctly, this filter decides whether a log line starts a new event or belongs to an existing one. For example, I use it to join every line that does not start with a bracketed log level such as "[INFO]" onto the previous line.

    multiline {
            pattern => "^\[%{LOGLEVEL}\]"
            negate => true
            what => "previous"
    }

I hope that helps

pandaadb