Logstash un-gzip array log configuration

Question

Hi everyone! I have a Logstash config that forwards logs from RabbitMQ to Elasticsearch. Something like this:

input {
    rabbitmq {
        ...
    }
}

filter {
    if [type] == "rabbitmq" {
        json {
            source => "message"
            target => "message"
        }
    }
}

output {
  elasticsearch {
    hosts => ["${ES_HOST}"]
    user => "${ES_USERNAME}"
    password => "${ES_PASSWORD}"
    sniffing => false
    index => "kit_events-%{[message][elasticsearch][index]}"
  }
}

We were forced to compress the logs on the fly because they were consuming too much traffic. The logs are now collected into an array and gzipped. What is the correct way to configure un-gzipping and splitting the array back into individual objects?

I did some research and found that there is a gzip_lines plugin, and something in Ruby(?) to parse the array, but I failed to implement either. Has anyone done something like this before?

UPD:

I added this filter:

filter {

  if [type] == "kitlog-rabbitmq" {
    ruby {
      init => "
        require 'zlib'
        require 'stringio'
      "
      code => "
        body = event.get('[http][response][body]').to_s
        sio = StringIO.new(body)
        gz = Zlib::GzipReader.new(sio)
        result = gz.read.to_s
        event.set('[http][response][body]', result)
      "
    }
  }
}

And now I am getting this error:

[ERROR][logstash.filters.ruby    ] Ruby exception occurred: not in gzip format
[DEBUG][logstash.pipeline        ] output received {"event"=>{"@timestamp"=>2018-11-30T09:16:19.127Z, "tags"=>["_rubyexception"], "@version"=>"1", "message"=>"x^\\x8B\\xAEV*\\xCE\\xCE\\xCC\\xC9)V\\xB2R\\x88V\\xD26T07\\xB7\\xB0\\xB4\\xB44000W\\x8A\\xD5QPJ\\xCE\\xCF+IL.\\u0001\\xCA*)\\u0001\\xB9\\xA9\\xB9\\x89\\x999 N\\x96C\\x96^r~.X,\\xA5\\u0014(R\\xADT\\x9A\\u000E6#\\xA0\\xB2$#?\\u000F\\xAC\\xB9\\u0000\\\"\\xE2\\u001C\\xAC\\u0014[\\v\\xE4\\xE6%概\\xF4z\\u0001\\xE9b%\\xA0\\xC8\\xC0\\xD9\\u001D\\v\\u0000\\u0003\\x9ADk", "type"=>"kitlog-rabbitmq"}}

I tried different gzipping methods, but the result is still the same. I also tried changing the input codec (plain with UTF-8, plain with binary).

score 1 · Accepted Answer · answered Nov 27 '18 at 19:37

So the content in rabbitmq is gzipped?

In the best of all possible worlds, Logstash would see the content-encoding header and unzip the content for you, but the plugin doesn't seem to do anything with that knowledge. You might request the feature.

The plugin does let you access the header, so you could do the un-gzipping yourself. Something like this:

filter {
  if [@metadata][rabbitmq_properties][content-encoding] == "gzip" {
    ruby {
      ...
    }
  }
}
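
What goes inside the ruby block depends on how the payload was actually compressed. The sample message in the question starts with the bytes x^ (0x78 0x5E), which looks like a zlib header rather than the gzip magic bytes 0x1F 0x8B, and that would be consistent with the "not in gzip format" error. The debug output also shows the payload in the message field, not in [http][response][body]. Below is a rough sketch in plain Ruby (runnable outside Logstash) that picks a decompressor from the first two bytes. The helper name decompress_payload is hypothetical, and the sketch assumes the bytes reach the filter unmodified, which may not hold if the input codec transcodes them.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not part of Logstash): chooses a decompressor
# by looking at the first two bytes of the payload.
def decompress_payload(raw)
  bytes = raw.to_s.b            # binary copy, no transcoding
  b0 = bytes.getbyte(0).to_i    # nil.to_i => 0 for short input
  b1 = bytes.getbyte(1).to_i

  text =
    if b0 == 0x1F && b1 == 0x8B
      # gzip container: the format Zlib::GzipReader expects
      Zlib::GzipReader.new(StringIO.new(bytes)).read
    elsif (b0 & 0x0F) == 8 && ((b0 << 8) | b1) % 31 == 0
      # zlib stream: deflate method plus a valid header checksum
      Zlib::Inflate.inflate(bytes)
    else
      # Assumption: anything else is treated as uncompressed text
      bytes
    end

  text.force_encoding(Encoding::UTF_8)
end
```

Inside the ruby filter, the same logic could be applied to event.get('message') and the result written back with event.set. The header check is only a heuristic, so wrapping the call in a rescue for Zlib::Error may be worthwhile.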

Examples of unzipping a string with Ruby exist elsewhere. Hopefully the library needed for that (Ruby's standard 'zlib' handles gzip) is available in Logstash.
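
For the array part of the question: once the payload has been decompressed, the JSON array can be parsed and turned into one record per log entry. A minimal sketch in plain Ruby, assuming the decompressed text is a JSON array of objects; the helper name split_log_array is hypothetical, and wrapping each entry under 'message' simply mirrors the target => "message" setting in the original config.

```ruby
require 'json'

# Hypothetical helper: turns a decompressed JSON array into one hash
# per log entry, each shaped like the events in the original pipeline.
def split_log_array(json_text)
  parsed = JSON.parse(json_text)
  # Tolerate a single object as well as an array of objects
  entries = parsed.is_a?(Array) ? parsed : [parsed]
  entries.map { |entry| { 'message' => entry } }
end
```

In a Logstash pipeline, a similar effect can probably be had without custom Ruby by parsing the decompressed string with the json filter into a field and then applying the split filter to that field, which emits one event per array element.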

Thank you, I will give it a try! – ikebastuz, Nov 28 '18 at 10:06
