Here's what I want; it's more or less the opposite of incremental data.

Some of the data are logs carrying a specific token, and I want to keep (or show in Elasticsearch) only the first submitted entry, i.e. the oldest information for each token.

I want to ignore any newer log with the same token.

How can I do that? Should it be done in Logstash or in Elasticsearch?

Thanks

Update 2016-05-31

I think this can be looked at from different perspectives, but overall what I want is the table shown in the picture without the red lines. I want those lines to be ignored by Logstash, or not displayed in ES queries.

I know it could be done if I were able to add a flag to the lines I want to delete, but that's not possible: the only thing that tells us they can be removed is that we already have a key first-AAA that was logged earlier. At logging time, we don't have this information.

Yoni Elyo

1 Answer

You can achieve this using the elasticsearch filter. The filter checks in ES whether the record already exists and, if it does, we tell Logstash to simply drop the line.

Note that I'm assuming the Id field (AAA) is used as the document _id and is also present in the document as the Id field. Feel free to adapt whatever you need to, but this will work.

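input {
   ...
}
filter {
   # Look up the incoming event's Id in Elasticsearch
   elasticsearch {
      hosts => ["localhost:9200"]
      query => "_type:your_type AND _id:%{[Id]}"
      # If a matching document exists, copy its Id field into [found]
      fields => {"Id" => "found"}
   }
   # [found] is only set when this Id was already indexed, so drop the event
   if [found] {
      drop {}
   }
}
output {
   elasticsearch {
      hosts => ["localhost:9200"]
      ...
   }
}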
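
The lookup above only works if the Id field really ends up as the document _id, which is the assumption stated earlier. A minimal sketch of an output section that would satisfy it is shown here. The `document_id` and `action` settings are options of the elasticsearch output plugin, but using them here is my assumption about your setup, not something taken from your existing config:

```conf
output {
   elasticsearch {
      hosts => ["localhost:9200"]
      # Assumption: store each event under its Id field, so that the
      # _id lookup in the filter can find events indexed earlier
      document_id => "%{Id}"
      # Optional safeguard: "create" fails instead of overwriting
      # when a document with this _id already exists
      action => "create"
   }
}
```

With `action => "create"`, a duplicate that slips past the filter (for example, two events with the same Id arriving before the first one is searchable) should be rejected by Elasticsearch rather than replacing the older document, so the first entry is the one that is kept.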
Val