
I'm trying to dissect the log message with the pattern shown in the following error. I validated my input using the dissect tester by jorgelbg, where it works without any issues.

I find it especially strange that the delimiter in the error is empty (``).

# Dissect Pattern
%{level} : %{timestamp} [%{class}] (%{file}) - %{message}
# Log Message
INFO : 2023-06-30 10:30:53,208 [d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208) - Response erhalten vom GeoServer in 2269ms
# Error
2023-06-30T10:31:32.515Z        DEBUG   [processors]    processing/processors.go:128    Fail to apply processor client{dissect=%{level} : %{timestamp} [%{class}] (%{file}) - %{message},field=message,target_prefix=, timestamp=[field=timestamp, target_field=@timestamp, timezone=UTC, layouts=[2006-01-02 15:04:05,999]], add_tags=tag}: could not find delimiter: `` in remaining: `INFO : 2023-06-30 10:30:53,208 [d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208) - Response erhalten vom GeoServer in 2269ms`, (offset: 0)

Is there anything I've got wrong about the pattern?

EDIT

I'm using the following filebeat config:

- type: filestream
  id: converter-logstream-id
  paths:
    - /logs/converter/*.log
  prospector.scanner.exclude_files:
    ["^/logs/content2alert-converter/transfer.log"]
  fields:
    input_source: converter
  processors:
    - dissect:
        tokenizer: "%{level} : %{timestamp} [%{class}] (%{file}) - %{message}"
        # field: "message"
        target_prefix: ""
        # trim_values: left
        overwrite_keys: true
    - timestamp:
        field: timestamp
        layouts:
          - "2006-01-02 15:04:05,999"
        test:
          - "2023-06-28 09:30:14,208"
    - add_tags:
        tags: [converter]
        target: "group"

EDIT 2

When printing to the console instead of Elasticsearch, I get the following response:

{
  "@timestamp": "2023-06-30T12:58:18.067Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.17.6"
  },
  "group": [
    "converter"
  ],
  "ecs": {
    "version": "1.12.0"
  },
  "host": {
    "name": "host"
  },
  "agent": {
    "name": "host",
    "type": "filebeat",
    "version": "7.17.6",
    "hostname": "filebeat-wind-log-1-cd5mq",
    "ephemeral_id": "0bd88f31-8872-4842-bc1b-cb9f11eb63f0",
    "id": "571b3737-67bb-458d-934a-549204b017f5"
  },
  "log": {
    "offset": 1386,
    "file": {
      "path": "/logs/converter/converter-debug.log"
    },
    "flags": [
      "dissect_parsing_error"
    ]
  },
  "message": "\u001b[34mINFO \u001b[0;39m: 2023-06-30 06:43:10,980 \u001b[1;30m[d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208)\u001b[0;39m - Response erhalten vom GeoServer in 3397ms",
  "input": {
    "type": "filestream"
  },
  "fields": {
    "input_source": "converter"
  }
}

It looks like there is some issue with hidden characters in the message.

Solution

I ended up using a custom script processor with JavaScript, as recommended in this thread, to remove the ANSI color codes (thanks @Val). The logger producing them is Logback, but I can't change it there.

In case anyone else is in the same situation, this is the script I'm using, which is based on this answer.

- script:
    lang: javascript
    source: >
      function process(event) {
        var originalMsg = event.Get('message');
        // Strip ANSI escape sequences (e.g. \x1b[34m, \x1b[0;39m) from the message
        var msg = originalMsg.replace(/\x1b\[([0-9,A-Z]{1,2}(;[0-9]{1,2})?(;[0-9]{3})?)?[mK]?/g, '');
        event.Put("message", msg);
      }
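For anyone who wants to check the replacement outside of Filebeat, here is a small standalone sketch (plain Node.js, not a Filebeat processor) that applies the same regex to the colored message from the console output above:

```javascript
// Standalone demo: strip the same ANSI escape sequences the script
// processor removes, using the colored log line from the console output.
var colored = "\u001b[34mINFO \u001b[0;39m: 2023-06-30 06:43:10,980 " +
    "\u001b[1;30m[d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208)" +
    "\u001b[0;39m - Response erhalten vom GeoServer in 3397ms";

var stripped = colored.replace(
    /\x1b\[([0-9,A-Z]{1,2}(;[0-9]{1,2})?(;[0-9]{3})?)?[mK]?/g, "");

// The line now starts with a literal "INFO " again, so the dissect
// tokenizer "%{level} : %{timestamp} ..." can find its delimiters.
console.log(stripped);
```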
Calipee
  • Can you show the definition of your ingest pipeline? – Val Jun 30 '23 at 11:24
  • I've added the filebeat config I'm using @Val – Calipee Jun 30 '23 at 12:37
    Good job with the script processor, but I'm sure you can configure the logback pattern to [remove those colors](https://logback.qos.ch/manual/layouts.html#coloring) – Val Jun 30 '23 at 14:13
  • Yes, you are right. I meant to say I'm not able to change it because I only have access to the logs, but not to the project producing them :) – Calipee Jul 12 '23 at 15:06

2 Answers


TL;DR

It seems to work just fine on my side. I am running the Elastic stack in version 8.7.

To reproduce

Filebeat pipeline

Using this configuration with Filebeat 7.17.6:

filebeat.inputs:
- type: filestream
  id: srt
  paths:
    - /usr/share/filebeat/*.log

processors:
  - dissect:
      tokenizer: "%{level} : %{timestamp} [%{class}] (%{file}) - %{message}"
      field: "message"
      
output.console:
  pretty: true

You get:

{
  "@timestamp": "2023-06-30T12:43:32.763Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.17.6"
  },
  "agent": {
    "type": "filebeat",
    "version": "7.17.6",
    "hostname": "ba94ba6326d9",
    "ephemeral_id": "4163b2cd-e263-4be0-995a-fcfcf340622a",
    "id": "447b1b2a-986d-4441-98d6-cd88e74cb320",
    "name": "ba94ba6326d9"
  },
  "ecs": {
    "version": "1.12.0"
  },
  "dissect": {
    "message": "Response erhalten vom GeoServer in 2269ms",
    "level": "INFO",
    "timestamp": "2023-06-30 10:30:53,208",
    "class": "d.f.f.w.c.c.GeoServerHelper",
    "file": "GeoServerHelper.java:208"
  },
  "log": {
    "offset": 0,
    "file": {
      "path": "/usr/share/filebeat/data.log"
    }
  },
  "message": "INFO : 2023-06-30 10:30:53,208 [d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208) - Response erhalten vom GeoServer in 2269ms",
  "input": {
    "type": "filestream"
  },
  "host": {
    "name": "ba94ba6326d9"
  }
}

Elasticsearch ingest pipeline

POST /_ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "_description",
    "processors": [
      {
        "dissect": {
          "field": "data",
          "pattern": """%{level} : %{timestamp} [%{class}] (%{file}) - %{message}"""
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "data": "INFO : 2023-06-30 10:30:53,208 [d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208) - Response erhalten vom GeoServer in 2269ms"
      }
    }
  ]
}

This should give you the following:

{
  "docs": [
    {
      "doc": {
        "_index": "index",
        "_id": "id",
        "_version": "-3",
        "_source": {
          "file": "GeoServerHelper.java:208",
          "data": "INFO : 2023-06-30 10:30:53,208 [d.f.f.w.c.c.GeoServerHelper] (GeoServerHelper.java:208) - Response erhalten vom GeoServer in 2269ms",
          "message": "Response erhalten vom GeoServer in 2269ms",
          "level": "INFO",
          "class": "d.f.f.w.c.c.GeoServerHelper",
          "timestamp": "2023-06-30 10:30:53,208"
        },
        "_ingest": {
          "timestamp": "2023-06-30T11:57:26.127545753Z"
        }
      }
    }
  ]
}
Paulo

The problem is that you have ANSI color escape codes in your logs, such as `\u001b[34mINFO \u001b[0;39m`, and those don't match the dissect pattern.

You first need to fix the logger that creates those log files so that it doesn't emit those color characters, and then the dissect will work.
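As noted in the comments, if you do control the Logback configuration (the OP didn't), the cleaner fix is upstream: use a file appender pattern without coloring converters such as `%highlight(...)`. A hedged sketch of what such a `logback.xml` appender could look like — the exact pattern, logger abbreviation, and paths are assumptions based on the log format shown above:

```xml
<!-- Hypothetical logback.xml fragment: a plain pattern with no
     %highlight(...) converters, so no ANSI escape codes reach the file. -->
<appender name="FILE" class="ch.qos.logback.core.FileAppender">
  <file>/logs/converter/converter-debug.log</file>
  <encoder>
    <!-- literal parentheses are escaped as \( \) in Logback patterns -->
    <pattern>%-5level : %d{yyyy-MM-dd HH:mm:ss,SSS} [%logger{5}] \(%file:%line\) - %msg%n</pattern>
  </encoder>
</appender>
```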

Val