I have a filename in the format `<key>:<value>-<key>:<value>.log`, e.g. `pr:64-author:mxinden-platform:aws.log`, containing the logs of a test run.

I want to stream each line of the file to Elasticsearch via Logstash. Each line should be treated as a separate document, and each document should get fields derived from the filename. For the example above, the log line `17-12-07 foo something happened bar` would get the fields `pr` with value `64`, `author` with value `mxinden`, and `platform` with value `aws`.

At the time I write the Logstash configuration, I do not know the names of the fields.

How do I dynamically add fields to each line based on the fields contained in the filename?
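Conceptually, the dynamic extraction boils down to splitting the file stem on the pair separator and each pair on the key/value separator. As a plain-Python illustration of the logic (not Logstash; the function name is made up for this sketch):

```python
import os

def fields_from_filename(filename):
    """Parse a <key>:<value>-<key>:<value>.log filename into a dict,
    without knowing the key names in advance."""
    # Drop any directory prefix and the .log extension.
    stem, _ext = os.path.splitext(os.path.basename(filename))
    # Split into "key:value" pairs, then split each pair once on ":".
    return dict(pair.split(":", 1) for pair in stem.split("-"))

print(fields_from_filename("pr:64-author:mxinden-platform:aws.log"))
# → {'pr': '64', 'author': 'mxinden', 'platform': 'aws'}
```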

The static approach so far is:

filter {
  mutate { add_field => { "file" => "%{[@metadata][s3][key]}" } }
  grok { match => { "file" => "pr:%{NUMBER:pr}-" } }
  grok { match => { "file" => "author:%{USERNAME:author}-" } }
  grok { match => { "file" => "platform:%{USERNAME:platform}-" } }
}

Changes to the filename structure are fine.

mxinden
  • I'm not 100% sure if I understand your question, but can't you just match your filepath as described [here](https://stackoverflow.com/questions/22916200/logstash-how-to-add-file-name-as-a-field?rq=1). Something like `filter { grok { match => ["path","%{GREEDYDATA}%{WORD}:%{NUMBER:pr}-%{WORD}:%{WORD:author}-%{WORD}:%{WORD:platform}.log"] } }` could work for you... – Phonolog Dec 08 '17 at 12:35
  • @Phonolog But that would imply, that I know, which keys are in the filename. I don't know the keys like e.g. `author`, `platform`, `pr`, ..., at that point in time. I would like to determine the keys dynamically. My use case is, that I want to be able to add new fields to the filename without touching the Logstash configuration. – mxinden Dec 08 '17 at 17:35
  • 2
    The way I would do it is to get the whole filename into a field (excluding the .log) then use the kv filter to split on that field with the field split = "-" and value split = ":" – Dan Griffiths Dec 08 '17 at 23:01
  • @mxinden Alright I got it now... Dan's solution sounds like a good way to go then. – Phonolog Dec 09 '17 at 09:13

1 Answer


Answering my own question, based on @dan-griffiths' comment:

The solution for a file named like `pr=64,author=mxinden,platform=aws.log` is to use the Logstash kv filter, e.g.:

  filter {
    kv {
      source      => "file"
      field_split => ","
      # value_split defaults to "=", matching the key=value pairs
    }
  }

where `file` is a field extracted from the filename via the AWS S3 input plugin.
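If you keep the original `key:value-key:value` naming instead, the same kv filter should work by swapping the separators. An untested sketch, which also strips the `.log` suffix so it does not leak into the last value:

```
filter {
  mutate { add_field => { "file" => "%{[@metadata][s3][key]}" } }
  # Remove the ".log" extension before splitting into pairs.
  mutate { gsub => [ "file", "\.log$", "" ] }
  kv {
    source      => "file"
    field_split => "-"
    value_split => ":"
  }
}
```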
