My Netty based app
1. ingests hundreds of thousands of messages per second from a single TCP connection
2. processes those messages in a couple of inbound handlers
3. sends the processing results somewhere downstream
Currently, all of this runs on a single thread, since it all arrives on a single TCP connection. I would like to parallelize step 2 (the processing). The difficulty is that the messages cannot simply be processed in parallel willy-nilly, because there is a partial order on them. You can think of there being a key(message) function: all messages for which this function returns the same result must be processed sequentially, but messages with different results may run in parallel. So I am thinking of having a mapping from message to thread, like hash(key(message)) % threadCount.
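To illustrate what I mean, here is a minimal sketch of that mapping in plain java.util.concurrent (not Netty-specific; the KeyedExecutor class and its method names are made up for illustration). Each key hashes to one fixed single-threaded executor, so messages sharing a key stay ordered while messages with different keys can run in parallel:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: route each task to a fixed single-threaded executor
// chosen by hash(key) % threadCount, so tasks sharing a key run
// sequentially while tasks with different keys may run in parallel.
public class KeyedExecutor {
    private final ExecutorService[] lanes;

    public KeyedExecutor(int threadCount) {
        lanes = new ExecutorService[threadCount];
        for (int i = 0; i < threadCount; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    public void execute(Object key, Runnable task) {
        // floorMod, because hashCode() may be negative
        lanes[Math.floorMod(key.hashCode(), lanes.length)].execute(task);
    }

    public void shutdownAndWait() throws InterruptedException {
        for (ExecutorService lane : lanes) lane.shutdown();
        for (ExecutorService lane : lanes) lane.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

In Netty terms, since EventExecutorGroup is Iterable over its EventExecutors, the same idea could presumably pin each key class to one EventExecutor of a DefaultEventExecutorGroup instead of a raw ExecutorService.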
Imagine this pipeline:
pipeline.addLast(deframer);
pipeline.addLast(new IdleStateHandler(...));
pipeline.addLast(decoder);
pipeline.addLast(bizLogicHandler1);
pipeline.addLast(bizLogicHandler2);
In the decoder, I am able to compute the result of key(message), so I would like to parallelize everything downstream of the decoder. It is documented that in order to use multiple threads I can do
static final EventExecutorGroup group = new DefaultEventExecutorGroup(16);
...
pipeline.addLast(group, "bizLogicHandler1", bizLogicHandler1);
pipeline.addLast("bizLogicHandler2", bizLogicHandler2);
which I guess means that bizLogicHandler1 and everything below it (in the above example, that would be bizLogicHandler2) will be able to run in parallel? Or would I have to specify group for bizLogicHandler2 as well?
The above will, however, still run completely serially, as the documentation explains; it offers UnorderedThreadPoolEventExecutor as an alternative that maximizes parallelism at the cost of giving up ordering entirely, which does not work in my case.
Looking at the EventExecutorGroup and EventExecutor interfaces, I don't see how it would be possible to convey which messages can be processed in parallel and which must be processed sequentially.
Any ideas?