I would like to integrate Elasticsearch with an external system over HTTP/REST in order to boost document scores. I'm not an Elasticsearch developer and I don't have much experience with it.

I can use a native script for that, but performance is a problem because I need to call the external system once for each document. What I would like instead is some kind of batch processing.

I don't want to store the information from the external system in Elasticsearch, because it can change over time.

Could you please advise me on how to do that?

Also, I couldn't find much documentation about custom native scripts or plugins, just a project on GitHub.

MicTech

1 Answer

Typically you would index these volatile attributes as Elasticsearch document fields and build a process to sync changes from your external master.
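
As a rough illustration of such a sync step, here is a minimal sketch that turns a batch of changed attributes into a newline-delimited body of partial `update` actions for Elasticsearch's `_bulk` endpoint. The index name `products`, the field `popularity`, and the shape of `changes` are assumptions made for the example, and actually sending the body is left out. Depending on the Elasticsearch version, the action line may also need a `_type`.

```python
import json

def build_bulk_update_body(index, changes):
    """Build an NDJSON body of partial-update actions for the _bulk endpoint.

    `changes` maps a document id to the dict of fields that changed
    (hypothetical shape, chosen for this sketch).
    """
    lines = []
    for doc_id, fields in changes.items():
        # Action line: which document to update.
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        # Payload line: only the volatile fields, merged into the stored document.
        lines.append(json.dumps({"doc": fields}))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical change set pulled from the external master.
body = build_bulk_update_body("products", {"42": {"popularity": 0.9}})
```

The resulting string would be sent as the request body of a POST to `_bulk`, so many documents are updated in a single round trip.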

If the cost of reindexing your documents becomes prohibitive, look into creating a Parent-Child relationship between a parent entity containing the static document fields and a child entity containing the more volatile fields.

This way, updates to the volatile child fields will not require reindexing the more static parent documents.

Peter Dixon-Moses
  • Thank you for your answer, but I am looking for a way to call the external system without storing any of its data in Elasticsearch. – MicTech Aug 17 '15 at 08:35
  • There's not a good solution. The reason is that the low-level Lucene Scorer needs to iterate over all documents matching your query (could be 100,000 or millions in some settings) to compute a score. Imagine the latency that could be incurred calling out to an external datasource 100,000 times while waiting for a query to return. This is why the information needed for scoring is typically mirrored into the index and kept in sync with the master. – Peter Dixon-Moses Aug 18 '15 at 17:25
  • It sounds like your preferred workflow would be (1) get a list of Elasticsearch doc ids matching your query, (2) go to an external system and retrieve scoring attributes for those doc ids, (3) use those scoring attributes to score documents inside Elasticsearch. I think you're stuck doing some custom post-processing (outside of Elasticsearch) for this if you don't want to store data in Elasticsearch. – Peter Dixon-Moses Aug 18 '15 at 17:30
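
The post-processing described in the last comment could look roughly like the sketch below. It assumes the search hits are already available as dicts with `_id` and `_score` (the shape Elasticsearch returns them in), and that the external system can return boosts for a whole list of ids in one call. The function `fetch_boosts`, the stand-in `fake_fetch_boosts`, and the choice to multiply the score by the boost are all hypothetical.

```python
def rescore_hits(hits, fetch_boosts, default_boost=1.0):
    """Re-rank search hits outside Elasticsearch using externally supplied boosts.

    `fetch_boosts` is a hypothetical callable that takes a list of doc ids
    and returns a dict mapping id -> boost in a single batched request.
    """
    ids = [hit["_id"] for hit in hits]
    boosts = fetch_boosts(ids)  # one call for the whole batch, not one per document
    rescored = [
        # Copy each hit and replace its score with the boosted score.
        dict(hit, _score=hit["_score"] * boosts.get(hit["_id"], default_boost))
        for hit in hits
    ]
    return sorted(rescored, key=lambda hit: hit["_score"], reverse=True)

def fake_fetch_boosts(ids):
    """Stand-in for the external HTTP/REST call (assumption for this sketch)."""
    table = {"a": 1.0, "b": 3.0}
    return {doc_id: table[doc_id] for doc_id in ids if doc_id in table}

hits = [
    {"_id": "a", "_score": 2.0},
    {"_id": "b", "_score": 1.0},
    {"_id": "c", "_score": 1.5},
]
reranked = rescore_hits(hits, fake_fetch_boosts)
```

Note that this only reorders the hits that were actually fetched, so a document that Elasticsearch ranked outside the requested page can never be promoted. Requesting more hits than are finally shown reduces that effect but does not remove it.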