
How can I control the decoding depth of decode_json_fields?

The max_depth setting doesn't seem to help in my case.

Goal: parse '/var/lib/docker/containers//.log' while capping the JSON decoding depth, so that hundreds of nested fields are not generated in the Elasticsearch index.

name: "host-01"
queue:
  mem:
    events: 16384
    # batch of events to the outputs. "0" ensures events are immediately available to be sent to the outputs.
    flush.min_events: 0


filebeat:
  prospectors:
    - type: log
      paths:
        - '/tmp/test.log'
      json:
        # key on which to apply the line filtering and multiline settings
        message_key: log
        keys_under_root: true
        add_error_key: true
      processors:
      - decode_json_fields:
          fields: ["log"]
          process_array: false
          max_depth: 1
          overwrite_keys: false

output:
  console:
    pretty: true
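
To test, Filebeat can be run directly against this configuration (assuming it is saved as filebeat.yml in the working directory):

filebeat -e -c filebeat.yml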

Example

echo '{"log":"{ "status": { "foo": { "bar": 1 } }, "bytes_sent": "0", "gzip_ratio": "-", "hostname": "cb7b5441f0da" }\n","stream":"stdout","time":"2018-12-29T11:25:36.130729806Z"}' >> /tmp/test.log

Actual result:

{
  ...
  "log": {
    "status": {
      "foo": {
        "bar": 1
      }
    },
    "bytes_sent": "0",
    "gzip_ratio": "-",
    "hostname": "cb7b5441f0da"
  },
  ...
}

Expected result:

{
  ...
  "log": {
    "status": "{ \"foo\": { \"bar\": 1 } }"
  },
  "bytes_sent": "0",
  "gzip_ratio": "-",
  "hostname": "cb7b5441f0da"
  ...
}

How can I control the decoding of nested JSON objects?

Here is some explanation: https://github.com/elastic/beats/issues/9834#issuecomment-451134008, but removing json: and leaving only decode_json_fields doesn't help either (a sketch of that variant is below).
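
Without the json: section the raw line lands in the message field, so the decode_json_fields processor has to target message instead of log. The variant I tried looks roughly like this:

filebeat:
  prospectors:
    - type: log
      paths:
        - '/tmp/test.log'
      processors:
        - decode_json_fields:
            fields: ["message"]
            process_array: false
            max_depth: 1
            overwrite_keys: false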

Crosslink on discuss.elastic.co: https://discuss.elastic.co/t/filebeat-how-control-level-nested-json-object-parsing-decode-json-fields/162876

We have the same requirement (apps may log deeply nested JSON; we don't want everything parsed, just the top level). I'm able to reproduce the described behavior. This seems like a bug to me. – Kristoffer Bakkejord Mar 06 '19 at 11:15

1 Answer


As of 2022, the Filebeat decode_json_fields processor is still not able to cater to this requirement:

Parsing JSON document keys only up to the Nth depth and leaving deeper JSON values as unparsed strings.

There's an open issue in the elastic/beats GitHub repository discussing the max_depth behaviour of the decode_json_fields processor, where a participant in the thread kindly provided a workaround leveraging Filebeat's script processor.

- script:
    lang: javascript
    source: >
      function process(event) {
          // Re-serialize any nested objects under "log" back into
          // strings, so only the top level remains decoded.
          var log = event.Get("log");
          for (var p in log) {
              if (log[p] != null && typeof log[p] === 'object') {
                  event.Put("log." + p, JSON.stringify(log[p]));
              }
          }
      }

PS: I have changed the root JSON key in the original snippet to "log" to match the OP's requirement.
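
For completeness, here is roughly how the workaround slots into the original configuration (a sketch based on the OP's config; the script processor must run after decode_json_fields, because it re-serializes the nested objects that decode_json_fields has already expanded):

      processors:
        - decode_json_fields:
            fields: ["log"]
            process_array: false
            overwrite_keys: false
        - script:
            lang: javascript
            source: >
              function process(event) {
                  var log = event.Get("log");
                  for (var p in log) {
                      if (log[p] != null && typeof log[p] === 'object') {
                          event.Put("log." + p, JSON.stringify(log[p]));
                      }
                  }
              }

With the OP's sample event, this should leave log.status as an unparsed string while the flat keys (bytes_sent, gzip_ratio, hostname) stay decoded.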

Alexandre Juma