I recently upgraded ES from version 1.1.0 to 1.6. All the configs remained same but almost 5% of the calls started throwing EsRejectedExecutionException
which never happened when I was running ES 1.1.0
The full stack trace is below:
org.elasticsearch.action.search.ReduceSearchPhaseException: Failed to execute phase [merge], [reduce]
at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction$1.onFailure(TransportSearchQueryAndFetchAction.java:93)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.onRejection(AbstractRunnable.java:65)
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:85)
at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.moveToSecondPhase(TransportSearchQueryAndFetchAction.java:78)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.innerMoveToSecondPhase(TransportSearchTypeAction.java:403)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:202)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onResult(TransportSearchTypeAction.java:178)
at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onResult(TransportSearchTypeAction.java:175)
at org.elasticsearch.search.action.SearchServiceTransportAction$12.handleResponse(SearchServiceTransportAction.java:346)
at org.elasticsearch.search.action.SearchServiceTransportAction$12.handleResponse(SearchServiceTransportAction.java:337)
at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:163)
at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:132)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 500) on org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction$1@4cd69494
at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:62)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:79)
All configurations remained the same while upgrading, so I am wondering what might be causing the issue. The thread pool size seems to be full, so I am trying to test by increasing the thread pool and Queue_size. But I am not able to understand what changed from 1.1 to 1.6 that is causing the issue.
I read the release notes and didn't find anything related to this. I was searching for any default value changes for any config parameters, but I didn't find anything which seems related.
I found this issue which is very similar to my issue, but this was resolved in version 1.1.2, so this shouldn't be causing the error.
It will be a great help if someone could give me pointers on how to go about finding the issue.