
Here is a sample bulk request from the Elasticsearch docs at: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html

POST _bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }
{ "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }

The docs mention that "Because this format uses literal \n's as delimiters, please be sure that the JSON actions and sources are not pretty printed".
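To make that constraint concrete, here is a minimal sketch of how a client might serialize such a body. The helper name `build_bulk_body` and the `(action, metadata, source)` tuple shape are my own assumptions for illustration, not part of any Elasticsearch client. It relies on `json.dumps` emitting a single line when no `indent` is given, so the only literal newlines in the body are the delimiters.

```python
import json

def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples as newline-delimited JSON.

    Hypothetical helper: `source` is None for actions such as "delete"
    that carry no document line.
    """
    lines = []
    for action, meta, source in operations:
        # No indent argument, so each object is serialized on one line.
        lines.append(json.dumps({action: meta}))
        if source is not None:
            lines.append(json.dumps(source))
    # Every line, including the last one, is terminated by a newline.
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    ops = [
        ("index", {"_index": "test", "_type": "type1", "_id": "1"}, {"field1": "value1"}),
        ("delete", {"_index": "test", "_type": "type1", "_id": "2"}, None),
    ]
    print(build_bulk_body(ops), end="")
```

Newlines inside string values are escaped as `\n` by `json.dumps`, so they do not break the line-based framing.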

I would like to know the reason behind this input format, and why they did not choose an array of JSON objects instead.

For example, something like:

POST _bulk
    [
      { "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } },
      { "field1" : "value1" },
      { "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } },
      { "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } },
      { "field1" : "value3" },
      { "update" : {"_id" : "1", "_type" : "type1", "_index" : "test"} },
      { "doc" : {"field2" : "value2"} }
    ]

The structure above is only a rough sketch, but you get the idea. Is there some common convention in REST API design that I am missing? Why delimiters instead of an array?

TGW

1 Answer


That allows the Bulk endpoint to process the body one or two lines at a time (an action line, optionally followed by its source line). If it were a JSON array, ES would have to load and parse the whole JSON body into memory in order to extract one array element after another.

Since a bulk body can be quite large (e.g. hundreds of MB), this is an optimisation to prevent your ES server from crashing when it receives huge bulk requests.
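As a rough illustration of the idea (not how Elasticsearch itself is implemented), here is a minimal Python sketch of a line-by-line consumer. The function name `iter_bulk_actions` is hypothetical, and it assumes that only `delete` has no source line, as in the sample from the question. Only one action/source pair is held in memory at a time, regardless of the total body size.

```python
import io
import json

def iter_bulk_actions(stream):
    """Lazily yield (action, metadata, source) from a newline-delimited bulk body.

    `stream` is any iterable of text lines, e.g. an open file or a request
    body wrapper. Assumption: "delete" is the only action without a source line.
    """
    lines = (line for line in stream if line.strip())  # skip blank lines
    for header in lines:
        # Each action line is an object with exactly one key, e.g. {"index": {...}}.
        action, meta = next(iter(json.loads(header).items()))
        source = None
        if action != "delete":
            source_line = next(lines, None)
            if source_line is None:
                raise ValueError(f"missing source line for {action!r} action")
            source = json.loads(source_line)
        yield action, meta, source

if __name__ == "__main__":
    body = (
        '{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }\n'
        '{ "field1" : "value1" }\n'
        '{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }\n'
    )
    for action, meta, source in iter_bulk_actions(io.StringIO(body)):
        print(action, meta["_id"], source)
```

With a JSON array, a standard parser such as `json.load` would need the closing `]` before returning anything, which is exactly the whole-body buffering the newline format avoids.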

Val
  • Thanks, @Val, that really helped. Are there any other such APIs you have come across? Are there other ways to process bulk inputs like this? I cannot upvote as I do not have enough reputation points, but I am glad to accept this answer. – TGW Sep 11 '17 at 16:54
  • Cool, thanks. I will definitely try to implement something similar if I end up wrapping this API :) – TGW Sep 11 '17 at 17:35