
I am sending data from local log files to Graylog with Filebeat and I am seeing roughly a 20x storage overhead compared to the original files. There is a large number of metadata fields, but I can't seem to get rid of them. I have tried many variations of removing fields, such as:

processors:
  - drop_fields:
      fields: ["ecs.version", "agent.version", "agent.type", "agent.id", "agent.hostname", "input.type"]

Do any of you have any recommendations on how to strip everything except the timestamp and the raw log line that has been sent? I do not need anything like the id or agent type because they are all coming from the same place.

  • It looks like you're trying to remove those fields in filebeat/journalbeat; rather than getting the agent to go against its intended behaviour, the trick is probably to drop the unwanted fields in Graylog instead, e.g. with https://go2docs.graylog.org/5-0/making_sense_of_your_log_data/functions_descriptions.html#removefield and/or the old message https://community.graylog.org/t/how-to-add-remove-additional-fields-in-graylog/9445 – HBruijn Jun 22 '23 at 17:59
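
  A server-side approach along the lines of that comment would be a Graylog pipeline rule calling remove_field. This is only a sketch; the field names shown are examples, and the actual names in your messages depend on how your Beats input flattens them:

      rule "strip filebeat metadata"
      when
        has_field("agent_id")
      then
        // remove the fields we never query on
        remove_field("agent_id");
        remove_field("agent_type");
        remove_field("agent_version");
        remove_field("ecs_version");
      end

  You would attach the rule to a pipeline connected to the stream that receives the Filebeat messages.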

1 Answer


There are two metadata fields required by Graylog in order for Sidecar to function properly, collector_node_id and node.name, so you can't remove those.

Filebeat itself prevents removal of the @timestamp field.

And the @metadata_* fields are set by Logstash and cannot be removed.

Other than those, you can remove all the other Filebeat fields. Here's a config snippet you can use to do so in your filebeat.yml:

processors:
  - drop_fields:
      fields: [
          "agent.ephemeral_id",
          "agent.id",
          "agent.type",
          "agent.version",
          "ecs.version",
          "host.name",
          "input.type",
          "log.file.path",
          "log.offset"
      ]
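
A possibly shorter variant, assuming your Filebeat version flattens these under their parent keys: drop_fields also accepts a parent key, which removes the whole nested object, and ignore_missing suppresses errors when a listed field isn't present on a given event. A sketch, not tested against your setup:

    processors:
      - drop_fields:
          fields: ["agent", "ecs", "input", "log", "host"]
          ignore_missing: true

Note that Filebeat will still refuse to drop @timestamp and type, as mentioned above.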
Willman