7

I am trying to update a document's nested type field using update_by_query. I am using the following script query:

POST test/_update_by_query
{
  "script": {
    "source": "ctx._source.address = params.address",
    "params": {
      "address": [{"city":"Mumbai"}]
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "uid": "b123"
          }
        }
      ]
    }
  }
}

But I am getting the following error:

version conflict, required seqNo [607], primary term [16]. current document has seqNo [608] and primary term [16]

What is the reason for this issue, and how can I fix it? Can I use another query here instead of _update_by_query? Please help.

Suraj Dalvi

3 Answers

8

Update by query takes a snapshot of the data and then updates each matching document. This error means that the document has been updated by another process after your update by query call started running...
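
The snapshot-then-write behaviour described above can be illustrated with a toy in-memory model. This is only a sketch of the idea, not Elasticsearch code: `ToyIndex`, `putIfSeqNo`, and `demoConflict` are hypothetical names, and the sequence numbers start at 0 rather than matching the ones in the error message.

```javascript
// Toy in-memory model of optimistic concurrency control; not Elasticsearch code.
class ToyIndex {
  constructor() {
    this.docs = new Map(); // id -> { source, seqNo }
  }

  // Unconditional write: always succeeds and bumps the sequence number.
  put(id, source) {
    const current = this.docs.get(id);
    const seqNo = current ? current.seqNo + 1 : 0;
    this.docs.set(id, { source, seqNo });
    return seqNo;
  }

  // Read a copy of the document together with the seqNo it had at read time.
  snapshot(id) {
    const { source, seqNo } = this.docs.get(id);
    return { source: { ...source }, seqNo };
  }

  // Conditional write: succeeds only if the document is unchanged since the snapshot.
  putIfSeqNo(id, source, expectedSeqNo) {
    const current = this.docs.get(id);
    if (current.seqNo !== expectedSeqNo) {
      throw new Error(
        `version conflict, required seqNo [${expectedSeqNo}], current document has seqNo [${current.seqNo}]`
      );
    }
    this.docs.set(id, { source, seqNo: current.seqNo + 1 });
  }
}

function demoConflict() {
  const index = new ToyIndex();
  index.put('b123', { uid: 'b123' });

  // 1. The update by query takes its snapshot.
  const snap = index.snapshot('b123');

  // 2. Another process writes to the same document in the meantime.
  index.put('b123', { uid: 'b123', status: 'changed elsewhere' });

  // 3. Writing back against the stale snapshot is rejected.
  try {
    index.putIfSeqNo('b123', { ...snap.source, address: [{ city: 'Mumbai' }] }, snap.seqNo);
    return 'updated';
  } catch (err) {
    return err.message;
  }
}
```

Calling `demoConflict()` returns the conflict message instead of `'updated'`, because the write in step 2 moved the document past the sequence number recorded in the snapshot.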

You can choose to ignore those conflicts by doing this:

POST test/_update_by_query?conflicts=proceed

In the response, you're going to have an indication of how many documents were in conflict and you can run the update by query again to pick them up if desired.
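
As a rough sketch of "run it again to pick them up", the helper below repeats the call while the response still reports conflicts. `updateUntilNoConflicts` and `runUpdateByQuery` are hypothetical names: the callback is assumed to send the `conflicts=proceed` request and resolve to the JSON response body, which contains the `updated` and `version_conflicts` counters.

```javascript
// `runUpdateByQuery` is a hypothetical callback: it should send
// POST <index>/_update_by_query?conflicts=proceed and resolve to the JSON
// response body, which includes `updated` and `version_conflicts`.
async function updateUntilNoConflicts(runUpdateByQuery, maxAttempts = 3) {
  let attempts = 0;
  let totalUpdated = 0; // counts write operations, so a document updated in two passes counts twice
  let remainingConflicts = 0;
  do {
    const result = await runUpdateByQuery();
    attempts += 1;
    totalUpdated += result.updated;
    remainingConflicts = result.version_conflicts;
  } while (remainingConflicts > 0 && attempts < maxAttempts);
  return { attempts, totalUpdated, remainingConflicts };
}
```

With the official JavaScript client, the callback would typically wrap `client.updateByQuery` with `conflicts: 'proceed'`. Whether the counters sit on the response itself or on its `body` property depends on the client version, so check that before relying on this.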

Update:

If you need to update only a single document and you know its ID, then you don't need to use update by query, but simply the update endpoint. The big advantage is that the update endpoint has a parameter called retry_on_conflict which will retry the operation in case of conflicts, so that you can be sure that the document is eventually updated when the call returns:

POST test/_doc/123/_update?retry_on_conflict=3
{
  "doc": {
    "address": [{"city":"Mumbai"}]
  }
}
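
For the JavaScript client, the same request might look roughly like the sketch below. `buildUpdateParams` is a hypothetical helper and `client` is assumed to be an already connected client instance; `retry_on_conflict` is the actual parameter name of the update API.

```javascript
// Builds the parameter object for the client's `update` call.
// `buildUpdateParams` is a hypothetical helper; `retry_on_conflict` is the real API parameter.
function buildUpdateParams(index, id, partialDoc, retries = 3) {
  return {
    index,
    id,
    retry_on_conflict: retries, // retry this many times when a version conflict occurs
    body: { doc: partialDoc }, // partial document merged into the existing one
  };
}

const updateParams = buildUpdateParams('test', '123', { address: [{ city: 'Mumbai' }] });
// With a connected client (assumed to exist as `client`):
//   await client.update(updateParams);
```

Keeping the parameters in a plain object makes it easy to log or inspect the request before sending it.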
Val
  • In my case, I am updating only one document here. Is there any option to solve this error? – Suraj Dalvi Aug 20 '20 at 12:49
  • Do you know the id of the document you want to update? – Val Aug 20 '20 at 12:58
  • Does update_by_query not support the retry_on_conflict option? – Suraj Dalvi Aug 20 '20 at 13:08
  • No, it doesn't support that parameter – Val Aug 20 '20 at 13:09
  • So in general, when we know the id, it is better to use the update query instead of update_by_query, but when we don't know the id, we can use update_by_query to update multiple documents, right? – Suraj Dalvi Aug 20 '20 at 13:13
  • And I think the update query is faster than update_by_query. – Suraj Dalvi Aug 20 '20 at 13:14
  • Yes, that's correct, and it is less expensive because it doesn't run a query and directly fetches the document by id – Val Aug 20 '20 at 13:14
  • Okay, thanks. I will try this with an update query (when we know the id). – Suraj Dalvi Aug 20 '20 at 13:16
  • It is working. But I have another case in which I am updating more than one document, so I am using update_by_query there, and I am getting the same version conflict error: version conflict, required seqNo [607], primary term [16]. current document has seqNo [608] and primary term [16]. I am updating multiple documents here, so I can't use the update query. How can we handle this case? Is there any option here? – Suraj Dalvi Aug 21 '20 at 15:17
  • Conflicts arise when multiple clients update the same document, so first you need to figure out what other processes are trying to update the same document – Val Aug 21 '20 at 15:19
  • Actually, the error occurs randomly. I have added one level of retry there: when I get status code 409, I send the request one more time, but the version conflict error still occurs randomly. – Suraj Dalvi Aug 21 '20 at 17:02
  • Does the update_by_query conflicts=proceed option update the document when a conflict occurs, or does it just skip that document? – Suraj Dalvi Aug 22 '20 at 12:37
  • I really don't understand how I can deal with this. I don't even know the steps to reproduce it. Any suggestions? – Suraj Dalvi Aug 22 '20 at 12:47
  • Sorry to disturb you again. I have to update two documents, and for that I am using update_by_query and getting the version_conflict error. What if I update each document separately using the update query? Would that solve my problem? – Suraj Dalvi Aug 22 '20 at 14:12
  • @Val setting refresh=True worked with _update_by_query. Do you think it has any caveats? – Sandeep Balagopal May 20 '22 at 13:04
  • @SandeepBalagopal please create a new question with your exact use case and what is causing you issues – Val May 20 '22 at 13:12
  • @Val my question is not a different one. If I create one, it will be a duplicate. _update_by_query gives me a conflict error when the same doc appears in a loop. When I set refresh=True, as suggested in the two answers above, the update was successful and there were no errors. But your answer did not mention it, so I wonder whether that is because it has a caveat? – Sandeep Balagopal May 20 '22 at 13:21
5

You can pass refresh=True as an argument to your update_by_query call.

Prashant
  • Surprisingly, it worked, but can anyone tell me the reasoning behind it? I am not able to infer it. – tempEngineer Jul 14 '22 at 13:14
  • @tempEngineer because you are refreshing the index, so the changes are visible to other operations right away instead of waiting for the scheduled refresh – Rafael May 13 '23 at 13:00
2

I had the same problem: I needed to run two queries one after another on the same index. Using refresh=true solved my problem.

https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/update_by_query_examples.html

// First pass: replace the embedded user object on every customer that references this user.
await elasticWrapper.client.updateByQuery({
  index: ElasticIndex.Customer,
  refresh: true, // make the changes visible before the next update by query runs
  body: {
    query: {
      bool: {
        must: [
          {
            match: {
              'user.id': id,
            },
          },
        ],
      },
    },
    script: {
      source: `ctx._source.user=params.user`,
      lang: 'painless',
      params: {
        user: { id, name: fullName, thumbnail },
      },
    },
  },
});

// Second pass: update the thumbnail on every nested task assigned to this user.
await elasticWrapper.client.updateByQuery({
  index: ElasticIndex.Customer,
  refresh: true,
  body: {
    query: {
      nested: {
        path: 'tasks',
        query: {
          bool: {
            must: [
              {
                exists: {
                  field: 'tasks.assignee',
                },
              },
              {
                term: {
                  'tasks.assignee.id': id,
                },
              },
            ],
          },
        },
      },
    },
    script: {
      // Skip tasks without an assignee so the script does not fail on a null reference.
      source: `for (int i = 0; i < ctx._source.tasks.size(); i++) {
        if (ctx._source.tasks[i].assignee != null && ctx._source.tasks[i].assignee.id == params.id) {
          ctx._source.tasks[i].assignee.thumbnail = params.thumbnail;
        }
      }`,
      lang: 'painless',
      params: {
        id,
        thumbnail,
      },
    },
  },
});
Rafiq