
I have ECK set up and I'm using Filebeat to ship logs from Kubernetes to Elasticsearch.

I've recently added the decode_json_fields processor to my configuration so that I'm able to decode the JSON that is usually in the message field.

      - decode_json_fields:
          fields: ["message"]
          process_array: false
          max_depth: 10
          target: "log"
          overwrite_keys: true
          add_error_key: true
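
The intent is that the escaped JSON string in message gets decoded into structured fields under the target. Roughly, for the example document below, I expect a message value like

    "message": "{\"message\":{\"log_type\":\"cron\",\"status\":\"start\"},\"level\":\"info\",\"timestamp\":\"2020-10-08T08:43:18.670Z\"}"

to end up under the log target as something like this (sketched by hand from the example document and the error output below, not actual Filebeat output):

    "log": {
      "message": { "log_type": "cron", "status": "start" },
      "level": "info",
      "timestamp": "2020-10-08T08:43:18.670Z"
    }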

However, logs have stopped appearing since I added it.

Example log:

{
  "_index": "filebeat-7.9.1-2020.10.01-000001",
  "_type": "_doc",
  "_id": "wF9hB3UBtUOF3QRTBcts",
  "_score": 1,
  "_source": {
    "@timestamp": "2020-10-08T08:43:18.672Z",
    "kubernetes": {
      "labels": {
        "controller-uid": "9f3f9d08-cfd8-454d-954d-24464172fa37",
        "job-name": "stream-hatchet-cron-manual-rvd"
      },
      "container": {
        "name": "stream-hatchet-cron",
        "image": "<redacted>.dkr.ecr.us-east-2.amazonaws.com/stream-hatchet:v0.1.4"
      },
      "node": {
        "name": "ip-172-20-32-60.us-east-2.compute.internal"
      },
      "pod": {
        "uid": "041cb6d5-5da1-4efa-b8e9-d4120409af4b",
        "name": "stream-hatchet-cron-manual-rvd-bh96h"
      },
      "namespace": "default"
    },
    "ecs": {
      "version": "1.5.0"
    },
    "host": {
      "mac": [],
      "hostname": "ip-172-20-32-60",
      "architecture": "x86_64",
      "name": "ip-172-20-32-60",
      "os": {
        "codename": "Core",
        "platform": "centos",
        "version": "7 (Core)",
        "family": "redhat",
        "name": "CentOS Linux",
        "kernel": "4.9.0-11-amd64"
      },
      "containerized": false,
      "ip": []
    },
    "cloud": {
      "instance": {
        "id": "i-06c9d23210956ca5c"
      },
      "machine": {
        "type": "m5.large"
      },
      "region": "us-east-2",
      "availability_zone": "us-east-2a",
      "account": {
        "id": "<redacted>"
      },
      "image": {
        "id": "ami-09d3627b4a09f6c4c"
      },
      "provider": "aws"
    },
    "stream": "stdout",
    "message": "{\"message\":{\"log_type\":\"cron\",\"status\":\"start\"},\"level\":\"info\",\"timestamp\":\"2020-10-08T08:43:18.670Z\"}",
    "input": {
      "type": "container"
    },
    "log": {
      "offset": 348,
      "file": {
        "path": "/var/log/containers/stream-hatchet-cron-manual-rvd-bh96h_default_stream-hatchet-cron-73069980b418e2aa5e5dcfaf1a29839a6d57e697c5072fea4d6e279da0c4e6ba.log"
      }
    },
    "agent": {
      "type": "filebeat",
      "version": "7.9.1",
      "hostname": "ip-172-20-32-60",
      "ephemeral_id": "6b3ba0bd-af7f-4946-b9c5-74f0f3e526b1",
      "id": "0f7fff14-6b51-45fc-8f41-34bd04dc0bce",
      "name": "ip-172-20-32-60"
    }
  },
  "fields": {
    "@timestamp": [
      "2020-10-08T08:43:18.672Z"
    ],
    "suricata.eve.timestamp": [
      "2020-10-08T08:43:18.672Z"
    ]
  }
}

In the Filebeat logs I can see the following error:

2020-10-08T09:25:43.562Z WARN [elasticsearch] elasticsearch/client.go:407 Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0x36b243a0, ext:63737745936, loc:(*time.Location)(nil)}, Meta:null, Fields:{"agent":{"ephemeral_id":"5f8afdba-39c3-4fb7-9502-be7ef8f2d982","hostname":"ip-172-20-32-60","id":"0f7fff14-6b51-45fc-8f41-34bd04dc0bce","name":"ip-172-20-32-60","type":"filebeat","version":"7.9.1"},"cloud":{"account":{"id":"700849607999"},"availability_zone":"us-east-2a","image":{"id":"ami-09d3627b4a09f6c4c"},"instance":{"id":"i-06c9d23210956ca5c"},"machine":{"type":"m5.large"},"provider":"aws","region":"us-east-2"},"ecs":{"version":"1.5.0"},"host":{"architecture":"x86_64","containerized":false,"hostname":"ip-172-20-32-60","ip":["172.20.32.60","fe80::af:9fff:febe:dc4","172.17.0.1","100.96.1.1","fe80::6010:94ff:fe17:fbae","fe80::d869:14ff:feb0:81b3","fe80::e4f3:b9ff:fed8:e266","fe80::1c19:bcff:feb3:ce95","fe80::fc68:21ff:fe08:7f24","fe80::1cc2:daff:fe84:2a5a","fe80::3426:78ff:fe22:269a","fe80::b871:52ff:fe15:10ab","fe80::54ff:cbff:fec0:f0f","fe80::cca6:42ff:fe82:53fd","fe80::bc85:e2ff:fe5f:a60d","fe80::e05e:b2ff:fe4d:a9a0","fe80::43a:dcff:fe6a:2307","fe80::581b:20ff:fe5f:b060","fe80::4056:29ff:fe07:edf5","fe80::c8a0:5aff:febd:a1a3","fe80::74e3:feff:fe45:d9d4","fe80::9c91:5cff:fee2:c0b9"],"mac":["02:af:9f:be:0d:c4","02:42:1b:56:ee:d3","62:10:94:17:fb:ae","da:69:14:b0:81:b3","e6:f3:b9:d8:e2:66","1e:19:bc:b3:ce:95","fe:68:21:08:7f:24","1e:c2:da:84:2a:5a","36:26:78:22:26:9a","ba:71:52:15:10:ab","56:ff:cb:c0:0f:0f","ce:a6:42:82:53:fd","be:85:e2:5f:a6:0d","e2:5e:b2:4d:a9:a0","06:3a:dc:6a:23:07","5a:1b:20:5f:b0:60","42:56:29:07:ed:f5","ca:a0:5a:bd:a1:a3","76:e3:fe:45:d9:d4","9e:91:5c:e2:c0:b9"],"name":"ip-172-20-32-60","os":{"codename":"Core","family":"redhat","kernel":"4.9.0-11-amd64","name":"CentOS Linux","platform":"centos","version":"7 (Core)"}},"input":{"type":"container"},"kubernetes":{"container":{"image":"700849607999.dkr.ecr.us-east-2.amazonaws.com/stream-hatchet:v0.1.4","name":"stream-hatchet-cron"},"labels":{"controller-uid":"a79daeac-b159-4ba7-8cb0-48afbfc0711a","job-name":"stream-hatchet-cron-manual-c5r"},"namespace":"default","node":{"name":"ip-172-20-32-60.us-east-2.compute.internal"},"pod":{"name":"stream-hatchet-cron-manual-c5r-7cx5d","uid":"3251cc33-48a9-42b1-9359-9f6e345f75b6"}},"log":{"level":"info","message":{"log_type":"cron","status":"start"},"timestamp":"2020-10-08T09:25:36.916Z"},"message":"{"message":{"log_type":"cron","status":"start"},"level":"info","timestamp":"2020-10-08T09:25:36.916Z"}","stream":"stdout"}, Private:file.State{Id:"native::30998361-66306", PrevId:"", Finished:false, Fileinfo:(*os.fileStat)(0xc001c14dd0), Source:"/var/log/containers/stream-hatchet-cron-manual-c5r-7cx5d_default_stream-hatchet-cron-4278d956fff8641048efeaec23b383b41f2662773602c3a7daffe7c30f62fe5a.log", Offset:539, Timestamp:time.Time{wall:0xbfd7d4a1e556bd72, ext:916563812286, loc:(*time.Location)(0x607c540)}, TTL:-1, Type:"container", Meta:map[string]string(nil), FileStateOS:file.StateOS{Inode:0x1d8ff59, Device:0x10302}, IdentifierName:"native"}, TimeSeries:false}, Flags:0x1, Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse field [log.message] of type [keyword] in document with id '56aHB3UBLgYb8gz801DI'. Preview of field's value: '{log_type=cron, status=start}'","caused_by":{"type":"illegal_state_exception","reason":"Can't get text on a START_OBJECT at 1:113"}}

It throws an error because apparently log.message is of type "keyword", even though this field does not exist in the index mapping.

I thought this might be an issue with the "target": "log" setting, so I've tried changing it to something arbitrary like "my_parsed_message", "m_log", or "mlog", and I get the same error for all of them.

{"type":"mapper_parsing_exception","reason":"failed to parse field [mlog.message] of type [keyword] in document with id 'J5KlDHUB_yo5bfXcn2LE'. Preview of field's value: '{log_type=cron, status=end}'","caused_by":{"type":"illegal_state_exception","reason":"Can't get text on a START_OBJECT at 1:217"}}

Elastic version: 7.9.2

  • Can you share the mapping of your index please? `GET filebeat-7.9.1-2020.10.01-000001` Due to the large number of ECS fields it might be big, so you can share it through [gist](https://gist.github.com/) if you prefer – Val Oct 14 '20 at 07:38
  • @Val https://gist.github.com/kaykhancheckpoint/38f771f22d5f26e919c4e6881bea2e4e I also want to point out that these Filebeat indices are created automatically by Filebeat and I have not modified them. Note the date has changed on the latest Filebeat index. – Kay Oct 14 '20 at 07:54

1 Answer


The problem is that some of your JSON messages contain a message field that is sometimes a simple string and other times a nested JSON object (like in the case you're showing in your question).

After this index was created, the very first message that was parsed was probably a string and hence the mapping has been modified to add the following field (line 10553):

"mlog": {
   "properties": {
       ...
       "message": {
          "type": "keyword",
          "ignore_above": 1024
       },
   }
}

You'll find the same pattern for my_parsed_message (line 10902), my_parsed_logs (line 10742), etc...

Hence the next message that comes with message being a JSON object, like

{"message":{"log_type":"cron","status":"start"}, ...

will not work because it's an object, not a string...
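
You can reproduce the same conflict in isolation, outside of Filebeat. A minimal sketch (the index name is arbitrary; on a bare test index the first string gets dynamically mapped as text rather than keyword like in your Filebeat template, but the resulting conflict is the same):

    PUT test-json-conflict/_doc/1
    {
      "mlog": { "message": "a plain string" }
    }

    PUT test-json-conflict/_doc/2
    {
      "mlog": { "message": { "log_type": "cron", "status": "start" } }
    }

The first document fixes the mapping of mlog.message to a string type; the second is then rejected with a mapper_parsing_exception very similar to the one you're seeing, because a field already mapped as a string cannot suddenly hold an object.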

Looking at the fields of your custom JSON, it seems you don't really have control over either their taxonomy (i.e. naming) or what they contain...

If you're serious about wanting to search within those custom fields (which I think you are, since you're parsing the field; otherwise you'd just store the stringified JSON), then I can only suggest figuring out a proper taxonomy in order to make sure that they all get a standard type.

If all you care about is logging your data, then I suggest simply disabling the indexing of that message field. Another solution is to set dynamic: false in your mapping so that those fields are ignored, i.e. so they don't modify your mapping.
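
For instance, with the legacy index templates that Filebeat 7.x uses, one way (a sketch only; the template name is arbitrary and you'd adapt the field to your chosen target) is to layer a small higher-order template over the Filebeat one so that the target object is kept in _source but never parsed into the mapping:

    PUT _template/filebeat-mlog-override
    {
      "index_patterns": ["filebeat-*"],
      "order": 10,
      "mappings": {
        "properties": {
          "mlog": {
            "type": "object",
            "enabled": false
          }
        }
      }
    }

Replacing "enabled": false with "dynamic": false is the second option mentioned above: the sub-fields stay in _source but new ones are no longer added to the mapping. Either way, this only affects indices created after the template is in place.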

  • "start figuring out a proper taxonomy in order to make sure that they all get a standard type." i don't think i fully understand this point, the applications i write will always have logs printed in json format, but sometimes aws on its own or other applications might log just a string. So in this case i want to parse the message if its json or ignore it if its not json – Kay Oct 14 '20 at 09:18
  • Is there a way to disable indexing of that message field or to set dynamic: false using the Filebeat config? Or is this something I would have to set manually each time for every index? (Indices are created automatically every month.) – Kay Oct 14 '20 at 09:22
  • If you're willing to ditch the internal `message` field if it's a string, then you can easily do so with another `drop` processor before the event goes out to ES. However, the `message` field is just one of the multitude of fields you have and without a proper taxonomy, chances are high that this same problem is bound to happen for another field. – Val Oct 14 '20 at 09:26
  • Never mind, I don't want to ditch it; I just don't need it to parse JSON. I still don't understand what you mean by taxonomy. message is a core field on that index, right? I leave message as is, and I use a target field to store my decoded JSON. – Kay Oct 14 '20 at 09:35
  • The problem is not the top-level `message` field; that one is core and contains a string (i.e. the full unparsed JSON message). The problem is the `message` field that is present within the stringified JSON. That `message` field cannot be sometimes a string and sometimes another JSON object; that's the whole problem here, because the target field will not know how to parse it. – Val Oct 14 '20 at 09:37
  • Right, that message field will sometimes be a string and sometimes be JSON because I don't have control over how other applications in my k8s cluster log. So I need some way to decode the JSON, taking into account that sometimes it might be a string and sometimes it might be JSON. – Kay Oct 14 '20 at 09:38
  • There you go! Do you need to search on the parsed content? – Val Oct 14 '20 at 09:39
  • I do need to search on the parsed content; it's one of the use cases for why I'm decoding the JSON, so that I can search on the parsed content. – Kay Oct 14 '20 at 09:39
  • Then you have a problem and the solution is a bit more involved – Val Oct 14 '20 at 09:39
  • So disabling the indexing or setting dynamic: false won't work here because I want to search on the parsed content? – Kay Oct 14 '20 at 09:42
  • Correct, but at the same time you cannot index sometimes a string, sometimes JSON content in the same field... chicken/egg – Val Oct 14 '20 at 09:45
  • Do you have a way to easily identify each application that is logging (via k8s labels, container name, etc.)? Do you know approximately how many applications are logging? – Val Oct 14 '20 at 09:59
  • Yes, we can add metadata labels to Kubernetes applications: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ Are you thinking of adding a specific label to an application that we write, so it can be identified later on? – Kay Oct 14 '20 at 10:52
  • For example, only run the decode_json_fields processor if there is a certain label or container name – Kay Oct 14 '20 at 10:58
  • Yes, that's what I'm hinting at. If something allows you to distinguish where the logs are coming from, it'll be easier to apply different parsing strategies to those logs. – Val Oct 14 '20 at 11:29
  • No, I don't have time to attempt this until next week now :( I'm working on something else; I will come back to it next week – Kay Oct 16 '20 at 14:06
  • I tested it and it seems to have worked. I don't know how I can set a list of container names though, instead of just one (a sketch with more than one container name follows this thread): ``` - decode_json_fields: when: equals: kubernetes.container.name: "entity-extractor-api-prod" fields: ["message"] process_array: false max_depth: 10 target: "log" overwrite_keys: true add_error_key: true ``` – Kay Oct 17 '20 at 08:43
  • Although I've not properly tested the solution and I don't think I will have time to before the bounty runs out, I will give it to you as you helped me understand the problem better and walked me through it – Kay Oct 17 '20 at 08:46
  • Thanks, no worries. When you have time, just chime in here and we'll get it done ;-) – Val Oct 17 '20 at 15:18
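
A sketch of the conditional variant discussed in the comments above, extended to more than one container name (the container names are placeholders taken from this thread; Filebeat's when conditions support an or over several equals clauses):

      - decode_json_fields:
          when:
            or:
              - equals:
                  kubernetes.container.name: "stream-hatchet-cron"
              - equals:
                  kubernetes.container.name: "entity-extractor-api-prod"
          fields: ["message"]
          process_array: false
          max_depth: 10
          target: "log"
          overwrite_keys: true
          add_error_key: true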