
I want the index in Elasticsearch to be named after an uploaded file. I followed the answer suggested here - Logstash filename as ElasticSearch index. However, instead of getting the file name as the index, I get an index named literally %{index_name}, i.e., exactly what is between the quotes. What am I doing wrong?

Update - my syslog.conf:

input {
  beats {
    port => 5044
  }
  udp {
    port => 514
    type => "syslog"
  }
  file {
       path => "C:\web-developement\...\data\*.log"
       start_position => "beginning"
       type => "logs"
   }
}

filter {    
    grok {
     match => ["path", "data/%{GREEDYDATA:index_name}" ]
    }
}

output {
  elasticsearch { 
      hosts => ["localhost:9200"] 
      index => "%{index_name}"
      manage_template => false
  }
  stdout { codec => rubydebug }
}

Update 2 - Logstash output:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.headius.backport9.modules.Modules (file:/C:/ELK/logstash-7.6.2/logstash-7.6.2/logstash-core/lib/jars/jruby-complete-9.2.9.0.jar) to field java.io.Console.cs
WARNING: Please consider reporting this to the maintainers of com.headius.backport9.modules.Modules
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Sending Logstash logs to C:/ELK/logstash-7.6.2/logstash-7.6.2/logs which is now configured via log4j2.properties
[2020-06-10T17:37:34,552][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-06-10T17:37:34,670][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.6.2"}
[2020-06-10T17:37:36,144][INFO ][org.reflections.Reflections] Reflections took 37 ms to scan 1 urls, producing 20 keys and 40 values
[2020-06-10T17:37:37,535][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2020-06-10T17:37:37,704][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2020-06-10T17:37:37,752][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-06-10T17:37:37,755][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-06-10T17:37:37,838][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2020-06-10T17:37:38,020][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.specialized.RubyArrayOneObject) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-06-10T17:37:38,025][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["C:/ELK/logstash-7.6.2/logstash-7.6.2/config/syslog.conf"], :thread=>"#<Thread:0x577ae8e run>"}
[2020-06-10T17:37:38,798][INFO ][logstash.inputs.beats    ][main] Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}
[2020-06-10T17:37:39,237][INFO ][logstash.inputs.file     ][main] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"C:/ELK/logstash-7.6.2/logstash-7.6.2/data/plugins/inputs/file/.sincedb_029446dc83f19d43b8822e485aa6e7a4", :path=>["C:\\web-developement\\project\\data\\*.log"]}
[2020-06-10T17:37:39,263][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-06-10T17:37:39,312][INFO ][logstash.inputs.udp      ][main] Starting UDP listener {:address=>"0.0.0.0:514"}
[2020-06-10T17:37:39,353][INFO ][filewatch.observingtail  ][main] START, creating Discoverer, Watch with file and sincedb collections
[2020-06-10T17:37:39,378][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-06-10T17:37:39,390][INFO ][logstash.inputs.udp      ][main] UDP listener started {:address=>"0.0.0.0:514", :receive_buffer_bytes=>"65536", :queue_size=>"2000"}
[2020-06-10T17:37:39,404][INFO ][org.logstash.beats.Server][main] Starting server on port: 5044
[2020-06-10T17:37:39,665][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

Also, when the log file was updated, the output gave me the tag _grokparsefailure:

...
 "@timestamp" => 2020-06-10T14:44:11.390Z,
         "event" => {
        "timezone" => "+03:00",
         "dataset" => "logstash.log",
          "module" => "logstash"
    },
          "tags" => [
        [0] "beats_input_codec_plain_applied",
        [1] "_grokparsefailure"
    ],
      "@version" => "1"
}
Annie H.
  • Update your question with your full logstash pipeline. If you are getting `%{index_name}` as the name, you probably do not have a field named `index_name`, but to be sure you need to share your logstash pipeline. – leandrojmp Jun 08 '20 at 14:38
  • I've added the syslog.conf. Please let me know if I should provide more information. – Annie H. Jun 08 '20 at 18:41
  • Is your grok working? You are using a Windows path in your file input, which uses backslashes, \, but your grok is trying to match a path that uses forward slashes, /. This will never match, so you will never have the field `index_name`; you need to change your grok. Also, index names have some [restrictions](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#indices-create-api-path-params), and using file names as index names, which seems to be the case here, may not be a good idea. – leandrojmp Jun 08 '20 at 19:34
  • I've changed the backslash. If the grok filter looks like so - grok { match => ["path", "data\%{GREEDYDATA:index_name}" ] } - it results in an error. If I use two backslashes, \\, I get the error described initially. I understand that it is not the best way to have the index as the file name, but that's what I need for now. Thank you for sharing this info though. – Annie H. Jun 09 '20 at 15:26

1 Answer


Try the following grok pattern in your filter.

grok {
    match => ["path", "C:\\%{GREEDYDATA}\\%{GREEDYDATA:index_name}.log"]
}

This will match any path that starts with C:\ and will extract the name of the file and store it in the field index_name.

For example, for the path C:\Web-development\tests\filename001.log, index_name will be filename001.

If any of your files have an uppercase letter, you will need to use a mutate filter to convert index_name to lowercase, since you can't have uppercase letters in an index name. If a filename contains a space, you will also need a mutate filter to remove it, since you can't have spaces in an index name either. Those are some of the restrictions.
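
A minimal sketch of such a mutate filter, placed after the grok above. The lowercase option matches the restriction just described; replacing spaces with underscores is one possible choice, not something prescribed here:

mutate {
    # index names must be lowercase, so lowercase the extracted file name
    lowercase => ["index_name"]
    # index names cannot contain spaces; replace them with underscores (one possible choice)
    gsub => ["index_name", " ", "_"]
}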

leandrojmp
  • Still didn't work out, even if I set the full path. It still shows my index name as %{index_name}. Could it be a problem with the %{} placeholder? My file names don't contain capital letters or spaces. – Annie H. Jun 09 '20 at 22:59
  • No, the `%{FIELDNAME}` is needed; it is the way logstash can access the content of the referenced field. Try to run your pipeline with only the `stdout` output and without any filter (a minimal sketch follows this thread), and update your question with this output, so we can see how logstash is dealing with your messages. – leandrojmp Jun 09 '20 at 23:41
  • By the output, do you mean the one that I receive in the console? – Annie H. Jun 10 '20 at 11:14
  • Yes, the `stdout` outputs to the console. – leandrojmp Jun 10 '20 at 13:17
  • I've added the output to the question description. – Annie H. Jun 10 '20 at 14:54
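
For reference, a minimal sketch of the stdout-only debug pipeline suggested in the comments above. It assumes the project path shown in the sincedb log line earlier; the sincedb_path => "NUL" setting is an added convenience (NUL is the Windows null device, so sincedb state is discarded and files are re-read on every run) and was not part of the original config:

input {
  file {
    # the path that appears in the sincedb log line above
    path => "C:\web-developement\project\data\*.log"
    start_position => "beginning"
    # discard sincedb state via the Windows null device so files are re-read on each run
    sincedb_path => "NUL"
  }
}

output {
  # no filters: print the raw events, including the path field, to the console
  stdout { codec => rubydebug }
}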