I'm using Elasticsearch 5.1.1 to bulk insert documents into an index. In the bulk response I need to get one of each document's fields along with the autogenerated _id, so that I can update a database.

I have been trying requests like the following ones:

curl -XPOST localhost:9200/_bulk?pretty -d '
{ "index" : { "_index" : "articles_201701", "_type" : "articles_type" , "_source_include" : "db_id"} }
{ "db_id" : "value1" }
{ "index" : { "_index" : "articles_201701", "_type" : "articles_type" , "_source_include" : "db_id"} }
{ "db_id" : "value2" }
'


curl -XPOST localhost:9200/_bulk?pretty -d '
{ "index" : { "_index" : "articles_201701", "_type" : "articles_type", "fields" : ["db_id"]} }
{ "db_id" : "value1" }
{ "index" : { "_index" : "articles_201701", "_type" : "articles_type", "fields" : ["db_id"]} }
{ "db_id" : "value2" }
'

curl -XPOST 'localhost:9200/_bulk?pretty&fields=db_id' -d '
{ "index" : { "_index" : "articles_201701", "_type" : "articles_type" } }
{ "db_id" : "value1" }
{ "index" : { "_index" : "articles_201701", "_type" : "articles_type"} }
{ "db_id" : "value2" }
' 

I also tried slight variations and combinations of these, but with no luck.

Not sure if this is even possible...

  • When you use bulk to insert documents, you need to supply _index, _type and _id. If you don't supply _id, Elasticsearch will generate one for you, and the response will contain the generated _id. Elasticsearch doesn't generate other fields – Lax Dec 30 '16 at 03:52
  • Thanks for your answer, and sorry, maybe I didn't explain it well. I do need ES to generate the _id, but I also need it to return the "db_id" field (yes, the one that I index, from the document) along with the _id. This way I can perform something like `UPDATE table set elastic_id = :_id WHERE db_id = :db_id`. Maybe this isn't the right approach; however, I think that querying the returned _ids (from the bulk insert) to find the db_ids is too much overhead... What I want to do is something like pgsql's `INSERT INTO table VALUES (?,?,?) RETURNING id,name` – Victor Duran Dec 30 '16 at 04:29
  • If you generate the db_id, why do you need Elasticsearch to return it for you? – Lax Dec 30 '16 at 04:40
  • Because I need to update the database with the Elasticsearch _id. I only know it after ES indexes the documents but, since this is a bulk insert, I get a long list of results from the individual index operations (with the _ids) and don't know which db_id pairs with which _id – Victor Duran Dec 30 '16 at 05:00
  • Create an array of objects, each containing db_id and id. Before calling bulk, set db_id; after calling bulk, set the id field to the _id that Elasticsearch generated. Then when you want to update, you have both id and db_id – Lax Dec 30 '16 at 05:04
  • If I'm not mistaken, the response you get from the bulk call will be in the same order as the request, i.e. the `db_id` of the eighth record in the bulk request will correspond to the generated `_id` at position 8 in the results, so the solution by @Lax would work. – Val Dec 30 '16 at 07:20
  • That's true, the response is in the same order as the request – Lax Dec 30 '16 at 07:35
  • Thanks @Val, I was tempted by this option, but I'm not certain that bulk results work that way. Can I trust that bulk results come back in that order? Is there something in the docs that clarifies that? – Victor Duran Dec 30 '16 at 14:38

1 Answer

Create an array of objects, each containing db_id and id. Before calling bulk, set db_id on each object; after calling bulk, set the id field to the _id that Elasticsearch generated for the item at the same position in the response. Then, when you want to update the database, you have both id and db_id.
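A minimal sketch of that pairing in Python, assuming (as the comments above state) that the `items` array in the bulk response is in the same order as the actions in the request. The helper names `build_bulk_body` and `pair_ids` are hypothetical, and the sample response is hand-written with made-up `_id` values to mimic the shape of a bulk response; it is not real Elasticsearch output.

```python
import json


def build_bulk_body(docs, index, doc_type):
    """Serialize docs into the newline-delimited body that _bulk expects."""
    lines = []
    for doc in docs:
        # Action line without _id, so Elasticsearch generates one.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


def pair_ids(docs, bulk_response):
    """Pair each request doc's db_id with the _id at the same response position."""
    items = bulk_response["items"]
    if len(items) != len(docs):
        raise ValueError("response item count does not match request doc count")
    pairs, failed = [], []
    for doc, item in zip(docs, items):
        result = item["index"]  # keyed by the action name used in the request
        if 200 <= result.get("status", 0) < 300:
            pairs.append((doc["db_id"], result["_id"]))
        else:
            failed.append(doc["db_id"])
    return pairs, failed


docs = [{"db_id": "value1"}, {"db_id": "value2"}]
body = build_bulk_body(docs, "articles_201701", "articles_type")

# Hand-written stand-in for the parsed JSON response; the _id values are invented.
sample_response = {
    "took": 5,
    "errors": False,
    "items": [
        {"index": {"_index": "articles_201701", "_type": "articles_type",
                   "_id": "AVexampleId1", "status": 201}},
        {"index": {"_index": "articles_201701", "_type": "articles_type",
                   "_id": "AVexampleId2", "status": 201}},
    ],
}

pairs, failed = pair_ids(docs, sample_response)
for db_id, es_id in pairs:
    print(db_id, es_id)
```

Each `(db_id, es_id)` pair can then be fed to the `UPDATE table set elastic_id = :_id WHERE db_id = :db_id` statement from the question. Items whose status is not in the 2xx range are collected separately, so a partially failed bulk request does not write a wrong or missing `_id` to the database.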

Lax