
Our service is written in Java and uses Elasticsearch Java client version 7.10.1.

We use an ingest pipeline with the "date index name" processor to determine our index names. We have two types of events: index and update.

  1. Index event: this event should index a new document into Elasticsearch. Each index message contains a field that represents the message start time. The pipeline takes this field's value, prepends a fixed prefix, and uses the result as the name of the index the message should go to. If that index already exists, the message is indexed into it; if not, a new index with that name is created (each index should contain thousands of messages). The document ID is also part of the message.

  2. Update event: this event is an update to an existing document. Each update message contains both the message start time and the ID fields (same as the index event).

The problem is that when I receive an update event, I don't know the name of the index that holds the document I want to update. I would therefore like to reuse the same ingest pipeline with the same logic; the only difference is that instead of indexing a message, I want to update an existing document. As mentioned, the update event contains the ID of the message, which identifies the document within its index.

It sounds trivial, but for some reason, I can't find a way to use a pipeline in an update request.

Does anyone know how this can be solved?

Roy Leibovitz
  • You write that each update message contains both the message start time and the ID, and that the pipeline uses the start time plus a fixed prefix to determine the target index. Doesn't the date value determine where the document should go? – Tom Elias Jul 14 '21 at 15:16
  • The ingest pipeline performs a calculation based on this value. Basically, an index is created for each week. I could do this calculation myself, but I want to avoid that, which is why I am using the pipeline. – Roy Leibovitz Jul 15 '21 at 16:49
  • Something else I don't quite get: you send an update (POST?) request to a specific document that is already inside an index. Why are you recalculating the destination index? Does the timestamp change? – Tom Elias Jul 18 '21 at 07:57
  • First, I am using the Elasticsearch Java client, which probably uses POST in the background. Second, a very reasonable scenario is that I receive an update event a few months after the document was indexed. The index for this document is calculated by the ingest pipeline, as I mentioned. Therefore, when the update event arrives, I have to work out which index the document is in. I can calculate it myself, but the best solution would be to use the same ingest pipeline logic, which doesn't seem to be possible. – Roy Leibovitz Jul 19 '21 at 09:59
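
For illustration, the client-side fallback mentioned in the comments (calculating the index name in the service instead of in the pipeline) could look roughly like the sketch below. It is not the pipeline's actual configuration, which is not shown in the question. It assumes the pipeline rounds the start time down to the week in UTC, that weeks start on Monday, and that the date part of the index name is formatted as yyyy-MM-dd. The class name WeeklyIndexName, the method resolve, and the prefix "events-" are hypothetical.

```java
import java.time.DayOfWeek;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper: mirrors an assumed weekly, UTC-based index naming scheme.
final class WeeklyIndexName {

    // Assumed date format of the index name suffix.
    private static final DateTimeFormatter INDEX_DATE =
            DateTimeFormatter.ofPattern("yyyy-MM-dd");

    private WeeklyIndexName() {
    }

    // Returns prefix + the date of the Monday (UTC) of the week containing startTime.
    static String resolve(String prefix, Instant startTime) {
        LocalDate day = startTime.atZone(ZoneOffset.UTC).toLocalDate();
        // Assumption: weeks start on Monday.
        LocalDate weekStart = day.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        return prefix + INDEX_DATE.format(weekStart);
    }
}
```

Under these assumptions, a start time of 2021-07-14T15:16:00Z with the prefix "events-" gives "events-2021-07-12". The resulting name could then be used as the target index of the update request, together with the document ID from the update event. The sketch only stays correct as long as it matches the rounding, timezone, and format configured in the real pipeline.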
