I am using log4js to log data to a file in my app. I want to display some of this data in my Grafana dashboard, so I am using Promtail to read the logs from the file, pre-process them, and send them to Loki. In Loki, I want to filter the data based on the parsed values.

Here is an example of my logs:

[2023-02-12T04:01:23.587] [DEBUG] default - {
  "message_id": 123,
  "from": {
    "id": 123,
    "is_bot": false,
    "first_name": "XXX",
    "last_name": "XXXXX",
    "username": "XXXXX",
    "is_premium": true
  },
  "chat": {
    "id": 123,
    "title": "XXXXX",
    "username": "XXXXX",
    "type": "supergroup"
  },
  "date": 123,
  "message_thread_id": 123,
  "text": "XXX XXXXX XXXXX"
}

Here is my current Promtail configuration:

server:
  http_listen_port: 80
  grpc_listen_port: 9095
  log_level: debug

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://192.168.1.64:3100/loki/api/v1/push

scrape_configs:
- job_name: patriotbot
  pipeline_stages:
    - multiline:
        firstline: \[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\] 
    - regex:
        expression: '^\[.*\]\s\w*\s-\s(\{.*\})'
        name: log_entry
    - json:
        source: log_entry
        name: only_json
        expressions:
          from_id: 'from.id'
          is_bot: 'from.is_bot'
          first_name: 'from.first_name'
          last_name: 'from.last_name'
          username: 'from.username'
          chat_id: 'chat.id'
          chat_title: 'chat.title'
          chat_type: 'chat.type'
    - output:
        source: only_json
             
  static_configs:
  - targets:
      - localhost
    labels:
      job: patriot
      type: all
      __path__: /logs/all.log

I have two issues with my current configuration:

  • My logs are saved as multiple lines, making them difficult to parse. I have attempted to fix this with the multiline stage. The entries are now displayed properly in Loki, but I am not sure the result will work for the parsing stages that follow.

  • The timestamp and metadata in front of the JSON object are preventing it from being parsed properly. Should I strip them before passing the entry to the json stage?

Can somebody suggest changes to my configuration that would allow me to properly parse these multiline logs and extract the relevant data?

Max Zavodniuk

1 Answer

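Check this out. There might be some typos, as I am unable to test it right now. I assume that the firstline matching is OK. You should use the regex stage to split the string into a few named values, e.g. time, loglevel, something, and log_entry. The regex stage only extracts named capture groups, and (?s:.*) lets "." match newlines so that the whole multiline JSON body is captured. The next stages extract the timestamp from time and add loglevel as a label (I don't know whether that is useful). The json section is left untouched.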

server:
  http_listen_port: 80
  grpc_listen_port: 9095
  log_level: debug

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://192.168.1.64:3100/loki/api/v1/push

scrape_configs:
- job_name: patriotbot
  pipeline_stages:
    - multiline:
        firstline: \[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\] 
        max_wait_time: 3s
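    - regex:
        # expression: '^\[.*\]\s\w*\s-\s(\{.*\})'
        # Named groups become keys in the extracted map; (?s:.*) lets "." match newlines.
        expression: "^\\[(?P<time>\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{3})\\] \\[(?P<loglevel>\\S+)\\] (?P<something>\\S+) - (?P<log_entry>(?s:.*))$"
    - timestamp:
        source: time
        # Go reference-time layout matching 2023-02-12T04:01:23.587.
        # The log has no zone, so it is read as UTC unless "location" is set.
        format: "2006-01-02T15:04:05.000"
    - labels:
        loglevel: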
    - json:
        source: log_entry
        name: only_json
        expressions:
          from_id: 'from.id'
          is_bot: 'from.is_bot'
          first_name: 'from.first_name'
          last_name: 'from.last_name'
          username: 'from.username'
          chat_id: 'chat.id'
          chat_title: 'chat.title'
          chat_type: 'chat.type'
    - output:
        source: only_json
             
  static_configs:
  - targets:
      - localhost
    labels:
      job: patriot
      type: all
      __path__: /logs/all.log
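
To filter on the parsed values in Loki, the extracted values also have to reach Loki in some form: values produced by the regex and json stages live only in Promtail's extracted map until a labels stage promotes them. A minimal sketch of the tail of the pipeline, assuming chat_type is low-cardinality enough to be a label and that only the JSON body should be stored as the log line (both are assumptions, not something the question states):

```yaml
  pipeline_stages:
    # ... multiline, regex, timestamp and json stages as in the config above ...
    - labels:
        # An empty value means "use the extracted key with the same name".
        loglevel:
        chat_type:   # assumption: few distinct values, so it is safe as a label
    - output:
        # Replace the log line with the JSON body captured by the log_entry group.
        source: log_entry
```

With that in place, a selector such as {job="patriot", chat_type="supergroup"} should work in Grafana. High-cardinality fields such as from_id or username are probably better left out of the labels and filtered at query time instead, for example with LogQL's json parser ({job="patriot"} | json | from_username="XXXXX"), which becomes possible once the stored line is plain JSON.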
molu8bits