I have an index 'event' with a couple of million entries. Every event has a nested 'type' property consisting of id, name and color. Only 10 types exist in the system. The problem: if I change one of a type's properties, I have to update a couple of million events in the worst case. So I was wondering whether there is a way to supply these values on the fly.

Here is what I mean: I would store only the type id in each event and provide a dictionary (id -> name, color) in the request, containing the values I want to query.
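
To make the idea concrete, here is a minimal sketch of how such a dictionary could be resolved on the client side rather than inside Elasticsearch. It is only an illustration under assumptions: the field name `type_id`, the `EVENT_TYPES` table and both helper functions are hypothetical, and this approach would not by itself cover highlighting on the type name.

```python
# Hypothetical lookup table: type id -> current display properties.
EVENT_TYPES = {
    1: {"name": "foo", "color": "red"},
    2: {"name": "bar", "color": "blue"},
}


def query_for_type_name(name):
    """Translate a type name into a terms query on the stored id.

    Assumes each event stores only the id in a field called `type_id`.
    """
    ids = [tid for tid, props in EVENT_TYPES.items() if props["name"] == name]
    return {"query": {"terms": {"type_id": ids}}}


def enrich_hit(source):
    """Attach the current name and color to a returned event document."""
    props = EVENT_TYPES.get(source["type_id"], {})
    return {**source, "type": {"id": source["type_id"], **props}}


print(query_for_type_name("foo"))
print(enrich_hit({"title": "Kickoff", "type_id": 1}))
```

Renaming a type then only means changing one entry in the table; the indexed events stay untouched.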

Important: I need to keep features like aggregations and highlighting.

core
  • The question is not understandable. Can you elaborate with concrete examples of the docs and of the values you want to set? Generally speaking, queries are for ... querying, not for value setting. – Joe - GMapsBook.com Sep 22 '20 at 08:45
  • That is exactly the problem: I don't know where/how to find an answer in the docs. I need a hint on where to start. – core Sep 22 '20 at 08:47
  • "Generally speaking, queries are for ... querying, not for value setting." And what about script fields? My question goes in this direction but script fields don't help me. – core Sep 22 '20 at 08:58
  • Oh, so you want to send your query and pass some metadata along that wouldn't be used in the query but simply added to each doc? – Joe - GMapsBook.com Sep 22 '20 at 08:59
  • Generally I want to avoid storing millions of copies of the same data in my index. I am just wondering whether it is possible to query that data anyway by providing a dictionary with the current values (maybe metadata is the correct term, or virtual fields or aliases?). The dictionary could be: key = 1 (id), values: name 'foo', color 'red'. If I query for type name 'foo', I will find the parent event. I know that sounds like rocket science, so I am not sure whether any such possibilities exist. – core Sep 22 '20 at 09:34
  • Yeah, that sounds like rocket science, but more like the Space Shuttle Challenger disaster. Why don't you want to store millions of docs? Also, the power of ES comes from storing & efficiently indexing data so your search gets fast... – Joe - GMapsBook.com Sep 22 '20 at 10:12
  • Yes, searching is fast, but updating is not. And I do need to store millions of docs! Each doc is an event and each event has a type. Let's assume I have 2 million events of type 'meeting'. This name needs to be queryable! But suddenly the super user decides to rename this event type to 'conference'. Now I need to reindex 2 million docs, which takes very long (even with _update_by_query). Hence my question, because I have more such places and the data is growing fast. If you have any other advice on how to handle this, I would also gladly go without a Space Shuttle Challenger disaster ;-) – core Sep 22 '20 at 11:34

1 Answer

A continuation of the discussion in the comments:

The thing is, not being able to easily update a large collection is one of the trade-offs of NoSQL stores like Elasticsearch. You could use synonym filters for this, as discussed here, but you're still stuck with having to reindex (though that's probably faster than _update_by_query).

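Correction: you won't have to reindex. Closing the index, setting the new synonyms, and re-opening it should be enough, as long as the synonym filter is applied at search time (synonyms applied at index time only affect documents indexed after the change).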
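
A rough sketch of that close / update / re-open sequence, expressed as the three REST calls involved. The index name `event` comes from the question; the filter name `type_synonyms`, the synonym rule and the helper function are assumptions for illustration, and the sketch only builds the requests instead of sending them to a cluster.

```python
import json


def synonym_update_requests(index, filter_name, synonyms):
    """Build the REST calls that swap a synonym list on an existing index.

    `filter_name` is assumed to be a synonym token filter already referenced
    by a search-time analyzer in the index settings.
    """
    settings = {
        "analysis": {
            "filter": {
                filter_name: {"type": "synonym", "synonyms": synonyms}
            }
        }
    }
    return [
        ("POST", f"/{index}/_close", None),       # analysis settings need a closed index
        ("PUT", f"/{index}/_settings", settings),  # replace the synonym rules
        ("POST", f"/{index}/_open", None),        # make the index searchable again
    ]


# Example: treat the new name 'conference' as equivalent to the old 'meeting'.
for method, path, body in synonym_update_requests(
    "event", "type_synonyms", ["conference, meeting"]
):
    print(method, path, json.dumps(body) if body else "")
```

The index is unavailable for searches while it is closed, so this would typically be done in a short maintenance window.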

Joe - GMapsBook.com