I have over 30 million documents in Elasticsearch (version 6.3.3). I am trying to add a new field to all existing documents and set its value to 0.
For example: I want to add a start field, which did not previously exist in the Twitter document, and set its initial value to 0 in all 30 million documents.
In my case, I was able to update only about 4 million documents. If I check the submitted task with the Tasks API (http://localhost:9200/_tasks/{taskId}), the result looks something like this:
{
  "completed": false,
  "task": {
    "node": "Jsecb8kBSdKLC47Q28O6Pg",
    "id": 5968304,
    "type": "transport",
    "action": "indices:data/write/update/byquery",
    "status": {
      "total": 34002005,
      "updated": 3618000,
      "created": 0,
      "deleted": 0,
      "batches": 3619,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1.0,
      "throttled_until_millis": 0
    },
    "description": "update-by-query [Twitter][tweet] updated with Script{type=inline, lang='painless', idOrCode='ctx._source.Twitter.start = 0;', options={}, params={}}",
    "start_time_in_millis": 1574677050104,
    "running_time_in_nanos": 466805438290,
    "cancellable": true,
    "headers": {}
  }
}
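To put those numbers in perspective, here is a rough sketch of the progress they imply. It assumes the update rate stays constant for the rest of the run, which may not hold in practice; the variable names simply mirror the fields in the status output.

```shell
#!/bin/sh
# Values copied from the task status shown above.
total=34002005
updated=3618000
running_time_in_nanos=466805438290

# awk does the floating-point arithmetic that plain sh lacks.
summary=$(awk -v t="$total" -v u="$updated" -v ns="$running_time_in_nanos" 'BEGIN {
    secs = ns / 1e9                # elapsed time in seconds
    rate = u / secs                # documents updated per second so far
    remaining = (t - u) / rate     # seconds left, ASSUMING a constant rate
    printf "%.1f%% done, %.0f docs/s, about %.0f min remaining", 100 * u / t, rate, remaining / 60
}')
echo "$summary"
```

By this estimate the task was roughly 10.6% complete after about 467 seconds, so at the same pace it would need on the order of another hour to finish.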
The query I am executing against Elasticsearch looks something like this:
curl -XPOST "http://localhost:9200/_update_by_query?wait_for_completion=false&conflicts=proceed" -H 'Content-Type: application/json' -d'
{
  "script": {
    "source": "ctx._source.Twitter.start = 0;"
  },
  "query": {
    "exists": {
      "field": "Twitter"
    }
  }
}'
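A possible variation, offered as an untested sketch rather than a confirmed fix: narrow the query so it only matches documents that do not yet have Twitter.start. If the task stops partway, resubmitting this request should then pick up only the documents that are left. The function name resume_update is made up for illustration; the URL and script are the same as in the request above.

```shell
#!/bin/sh
# Hypothetical variant of the request above: same script, but the query
# skips documents that already contain Twitter.start.
BODY='{
  "script": {
    "source": "ctx._source.Twitter.start = 0;"
  },
  "query": {
    "bool": {
      "filter":   { "exists": { "field": "Twitter" } },
      "must_not": { "exists": { "field": "Twitter.start" } }
    }
  }
}'

# resume_update is an illustrative name, not part of any tool.
# It submits the narrowed update-by-query as a background task.
resume_update() {
  curl -XPOST "http://localhost:9200/_update_by_query?wait_for_completion=false&conflicts=proceed" \
    -H 'Content-Type: application/json' -d "$BODY"
}
```

Calling resume_update sends the request. Nothing is sent until the function is invoked, so the body can be reviewed first with echo "$BODY".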
Any suggestions would be great, thanks.