
I'm having some trouble understanding VRL and using vector.dev for transforming Django logs and uploading them to Elasticsearch. I have a Django log file with the following format:

Django 2023-05-25 16:00:20,714 [WARNING] django.request: Not Found: /todos/vinrrt/
Django {timestamp} {level} {module} {message}

I want to transform these logs so that I can upload them to Elasticsearch. Here's what I have done so far:

data_dir = "/home/abc/log"

[sources.django_logs]
type = "file"
include = ["/home/abc/log/django.log"]

[transforms.django_logs_transform]
type = "remap"
inputs = ["django_logs"]
source = """
structured =
  parse_syslog(.message) ??
  parse_common_log(.message) ??
  parse_regex!(.message, r'Django (?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[(?P<level>[^\]]+)\] (?P<module>[^:]+): (?P<message>.+)')
. = merge(., structured)"""

#[sinks.django_log_sink]
#type = "elasticsearch"
#inputs = ["django_logs_transform"]
#hosts = ["http://localhost:9200/"]
#index = "django_logs"

[sinks.django_log_sink_log]
type = "file"
inputs = ["django_logs_transform"]
path = "/home/abc/log/django-output.log"
encoding.codec = "logfmt"

I'm struggling with the regular expression in the remap transform function. Could someone please help me with the regex pattern?

I am expecting a valid transform block that labels the fields of my logs so that they can be sent to Elasticsearch.

InSync
  • Difficult to say, but I would personally replace the space by `\s` as it may be some tabs or several spaces for aligning columns. I would also set the milliseconds part of the time to be optional, as some log formats just stop at the precision of a second. Have a try with this pattern: https://regex101.com/r/Dsq1WF/1 – Patrick Janser May 30 '23 at 12:18
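
The two suggestions in the comment above (matching whitespace with `\s` and making the milliseconds optional) could be applied to the transform roughly as follows. This is a minimal sketch, not necessarily the pattern behind the regex101 link. It assumes the line layout shown in the question, uses `\s+` rather than a single `\s`, and captures the error from `parse_regex` instead of aborting with `!`; the variable names `parsed` and `err` are illustrative. It also writes the `source` value as a TOML literal string (`'''`), because a `"""` basic string treats backslashes as escape sequences.

```toml
[transforms.django_logs_transform]
type = "remap"
inputs = ["django_logs"]
# ''' is a TOML literal string, so regex backslashes such as \d and \s
# reach VRL unchanged.
source = '''
# \s+ tolerates tabs or repeated spaces; (?:,\d{3})? makes the milliseconds optional.
parsed, err = parse_regex(.message, r'^Django\s+(?P<timestamp>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}(?:,\d{3})?)\s+\[(?P<level>[^\]]+)\]\s+(?P<module>[^:]+):\s+(?P<message>.+)$')

# Merge the named captures only when the line matched;
# events that do not match pass through unchanged.
if err == null {
  . = merge(., parsed)
}
'''
```

For the sample line in the question, this is expected to populate `timestamp`, `level`, `module`, and `message` on the event, with the captured `message` replacing the raw line. Dropping the `parse_syslog` and `parse_common_log` fallbacks is a design choice of this sketch, on the assumption that the file contains only Django-formatted lines.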
