
I have the Logstash configuration file below, which is used to index a database (in this case just a CSV file). Every time I call Logstash with this configuration it adds to the existing Elasticsearch index, producing duplicates. I really want it to overwrite the existing index. I realize I could probably do this with two configuration calls, one with action => "delete" and the other with action => "index", but it seems like I should be able to do this in one step. It's also not clear from the documentation whether I can use upsert for this. (Also, notice I'm using the stdin input, which means Logstash exits once the document is indexed; it doesn't continue to watch the document for changes.) Thanks for any help.

input {
    stdin {}
}
filter {
    csv {
        columns => ["a", "b", "c", "d"]
        separator => ","
    }
}

output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "database.csv"
    }
}
user3071643
    Ok, I found a similar post here http://stackoverflow.com/questions/21716002/importing-and-updating-data-in-elasticsearch/21738549#21738549 – user3071643 Mar 24 '16 at 19:30

1 Answer


If you have (or can compute) an id from your CSV, you could just do this:

output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "database.csv"
        document_id => "%{yourComputedId}"
    }
}

Then every time you index a document with the same id, it will be updated in the Elasticsearch index instead of creating a duplicate.
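If no column is a natural id, one way to compute one is the fingerprint filter that ships with Logstash (logstash-filter-fingerprint). The sketch below is a minimal example, not your exact setup: it assumes column "a" uniquely identifies each row, and the field name [@metadata][computed_id] and the key string are arbitrary choices; adjust the source field(s) to whatever is unique in your data.

filter {
    csv {
        columns => ["a", "b", "c", "d"]
        separator => ","
    }
    # Hash column "a" into a deterministic id (assumption: "a" is unique per row).
    # The key can be any constant string; it just keeps the hash stable across runs.
    fingerprint {
        source => "a"
        target => "[@metadata][computed_id]"
        method => "SHA1"
        key => "any-constant-string"
    }
}

output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "database.csv"
        document_id => "%{[@metadata][computed_id]}"
    }
}

Because the id is derived from the row contents, re-running the same CSV through Logstash overwrites the existing documents rather than appending new ones. Rows that were removed from the CSV will still remain in the index, though; cleaning those up would still require a separate delete step.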

maximede