
I am using elasticsearch-dsl to construct two complex queries and execute them over multiple indices at the same time with MultiSearch. It works fine in 99% of cases, but for some input parameters it fails with the following error printed on the backend (Django):

2022-09-29 03:37:23,592 ERROR:  https://aws_instance_name.us-east-1.es.amazonaws.com:443

For such input parameters the request never succeeds, so the failure definitely depends on the inputs. The backend Django Python code runs fine up to the point where the response is actually returned, which I can see from logger.info statements, but the error still appears. When I replace the response with an empty one, to rule out problems further down the line caused by the specific response content, it still fails, so the response itself does not seem to be the cause. I also ran both queries separately over the very same indices in AWS Kibana (on the backend they are executed together in a single MultiSearch, so the problem could potentially be inside MultiSearch itself), and both return results fine within a couple of seconds. Is there a way to print detailed information about the failure, rather than just the ERROR: aws_instance message, which is useless? What else could be going wrong here?
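To make "detailed info" concrete, here is a minimal sketch of the extra logging I could turn on. It assumes the client is elasticsearch-py 7.x, where the loggers are named `elasticsearch` and `elasticsearch.trace` (other versions may use different names); the helper name `enable_es_debug_logging` is mine and is not part of any library.

```python
import logging


def enable_es_debug_logging(level=logging.DEBUG):
    """Attach a verbose stream handler to the Elasticsearch client loggers.

    Assumption: logger names follow elasticsearch-py 7.x conventions.
    """
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    for name in ("elasticsearch", "elasticsearch.trace"):
        es_logger = logging.getLogger(name)
        es_logger.setLevel(level)
        es_logger.addHandler(handler)
        # Avoid duplicate lines via parent loggers.
        es_logger.propagate = False
    return handler
```

Calling `enable_es_debug_logging()` once at startup (for example from the Django settings module) should be enough for the client to log request and response details, if the logger-name assumption holds.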

Right now I have only two ideas: replace the Python lists with NumPy arrays to avoid potential memory errors (no MemoryError is printed, though, so that is most likely not the issue), or replace MultiSearch with single Search calls in elasticsearch-dsl and see whether that works. Any suggestions would be greatly appreciated. I am not including code here because it is not clear what is going wrong in the first place or which part of the code is at fault, and the code base is huge. However, the MultiSearch construction code, along with the rest, can be found here:

https://github.com/broadinstitute/seqr/blob/master/seqr/utils/elasticsearch/es_search.py#L655
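The second idea (single Search calls instead of MultiSearch) could look roughly like the sketch below. It is not the seqr code: `execute_individually` is a hypothetical helper, and it only assumes that each object has an `execute()` method, as elasticsearch-dsl `Search` objects do. The `status_code` and `info` attributes are read with `getattr` because I am assuming, not certain, that the raised exception carries them (elasticsearch-py 7.x `TransportError` does).

```python
import logging

logger = logging.getLogger(__name__)


def execute_individually(searches):
    """Run each search on its own and return a list of (response, error) pairs."""
    results = []
    for index, search in enumerate(searches):
        try:
            results.append((search.execute(), None))
        except Exception as exc:
            # Log the full traceback plus any details the client attached.
            logger.exception(
                "search %d failed: status=%s info=%s",
                index,
                getattr(exc, "status_code", None),
                getattr(exc, "info", None),
            )
            results.append((None, exc))
    return results
```

Running the two queries this way would at least show which of them fails for the problematic inputs, and with what exception, instead of a single opaque error for the whole batch.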

Nikita Vlasenko
