I need to query a subset of records from a large, high-volume index while running a Painless script alongside the search query to augment the results. The (much smaller) result set should be saved to a secondary index for later use. In a different SO question, "Reindex part of Elasticsearch index onto new index via Jest", I mentioned that this is possible through the Kibana interface, but there does not seem to be a Java library that can accomplish what I need. Has anyone run a query inside a _reindex operation outside of Kibana? I am leaning toward using the URLConnection family in Java, but I am open to suggestions and advice at this point.
-
Do you have to use Java? If not, you can use any REST client to send a "reindex by query" request, which supports Painless scripts. Alternatively, you can use the Cerebro application as a UI for Elasticsearch management. – Vakhtang Jul 07 '18 at 19:27
-
I need to use Java to interface with the ES cluster, so yes, I am somewhat tied to the language. Also, I am using a Jest client (because AWS does not allow transport client access), and the closest thing I can find to what you are describing is Reindex.Builder(), with which I am having trouble executing Painless scripts. – BPS Jul 07 '18 at 19:47
-
Essentially, I want to do what you are describing: POST a _reindex request that looks something like this: POST _reindex?slices=10&wait_for_completion=false { "conflicts": "proceed", "source": { "index": "my_source_index", "size": 5000, "query": { "bool": { "filter": { "bool": { "must": [ ... ] } } } } }, "dest": { "index": "my_new_temp_index" }, "script": { "source": "...", "lang": "painless", "params": { ... } } } Can I send that as a payload with HttpURLConnection? – BPS Jul 07 '18 at 20:27