Hazelcast partition counts and thread concurrency

Question

In the Master Hazelcast eBook under "17.4.1. Partition-aware Operations" it states:

To execute partition-aware operations, an array of operation threads is created.

A single operation thread executes operations for multiple partitions;

Each partition belongs to only 1 operation thread.

Suppose I have the default 271 partitions on a 17-node cluster, each node with 16 partition threads. Distributing the partitions across the cluster gives 272 threads for 271 partitions, so every partition has one thread associated with it and each thread has at most 1 partition (which seems the optimal case to me).

Ignoring backups and near-caches, when I create an IMap instance, does this mean I can only ever have 1 concurrent put/get operation executing on every map partition across the cluster? Going further, if I attach a MapStore, does this mean I can only ever have 271 concurrent operations running against my backend database, since there is no way of making async MapStores?

My reason for asking is that I have a highly concurrent web application, and I recently switched the datastore to run with a Hazelcast IMap in front of it. The application accepts thousands of concurrent connections, and almost every single request performs at least one get operation on the distributed map. I'm seeing a lot of these errors:

com.hazelcast.core.OperationTimeoutException: No response for 20000 ms. Aborting invocation! Invocation{serviceName='hz:impl:mapService', op=com.hazelcast.map.impl.operation.GetOperation{identityHash=1003806362, serviceName='hz:impl:mapService', partitionId=244, replicaIndex=0, callId=55212219, invocationTime=1462913274676 (Tue May 10 20:47:54 UTC 2016), waitTimeout=-1, callTimeout=10000, name=..., name=...}, partitionId=244, replicaIndex=0, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeout=10000, target=Address[10.0.2.221]:5701, backupsExpected=0, backupsCompleted=0, connection=Connection [/10.0.2.219:5701 -> /10.0.2.221:14565], endpoint=Address[10.0.2.221]:5701, alive=true, type=MEMBER} No response has been received! backups-expected:0 backups-completed: 0

Could this simply be caused by the MapStore blocking the partition thread while it's trying to fetch from the database? I should also note that while it says No response for 20000 ms, 20s has not elapsed.

I'm running Hazelcast 3.6.2 on Java 8.

Did you change hazelcast.operation.call.timeout.millis? Your timeout is 20s instead of the usual 2x60 seconds. — pveentjer, May 11 '16 at 14:30

pveentjer · Answer 1 · 2016-05-11T14:15:23.757

Ignoring backups and near-caches, when I create an IMap instance, does this mean I can only ever have 1 concurrent put/get operation executing on every map partition across the cluster?

Correct. The same holds across maps: partition 25 serves both map a and map b, so if its thread is busy processing an operation for map b, an operation for map a on that partition has to wait.
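The numbers from the question can be checked with a small sketch. It is plain Java and does not use the Hazelcast API. The class and method names are hypothetical, and the modulo mapping from partition id to thread index is an assumption about how a member picks a partition thread, not something stated in this thread.

```java
// Hypothetical helper: arithmetic only, no Hazelcast dependency.
class PartitionThreadMath {

    // Total partition threads in the cluster.
    static int totalPartitionThreads(int members, int threadsPerMember) {
        return members * threadsPerMember;
    }

    // Largest number of partitions a single member owns when they are spread evenly.
    static int maxPartitionsPerMember(int partitions, int members) {
        return (partitions + members - 1) / members; // integer ceiling
    }

    // ASSUMPTION: a member selects the thread by partitionId modulo its thread count.
    static int threadIndexFor(int partitionId, int threadsPerMember) {
        return partitionId % threadsPerMember;
    }

    public static void main(String[] args) {
        System.out.println("threads in cluster:        " + totalPartitionThreads(17, 16));
        System.out.println("max partitions per member: " + maxPartitionsPerMember(271, 17));
        System.out.println("thread index for 244:      " + threadIndexFor(244, 16));
    }
}
```

If the modulo assumption holds, two partitions owned by the same member can still land on the same thread, so "one thread per partition" is an upper bound on concurrency rather than a guarantee.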

Going further, if I attach a MapStore, does this mean I can only ever have 271 concurrent operations running against my backend database, since there is no way of making async MapStores?

For a write-through MapStore: yes. But I'm not that familiar with the threading model of write-behind (async) MapStores.

Could this simply be caused by the MapStore blocking the partition thread while it's trying to fetch from the database? I should also note that while it says No response for 20000 ms, 20s has not elapsed.

That could very well be the cause.
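The effect can be modelled without Hazelcast. The sketch below stands in for one partition thread with a single-threaded executor, and for a slow MapStore load with a `Thread.sleep`. All names are hypothetical, and this is only a simplified model of the queuing behaviour, not Hazelcast's actual implementation.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical model: one single-threaded executor plays the role of a partition thread.
class BlockedPartitionThreadSketch {

    // Submits a slow "load" and then a fast "get" to the same thread, and returns
    // how long the fast get took to complete, in milliseconds.
    static long fastGetLatencyMillis(long slowLoadMillis) throws Exception {
        ExecutorService partitionThread = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for a MapStore load that blocks on the database.
            partitionThread.submit(() -> {
                try {
                    Thread.sleep(slowLoadMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            long start = System.nanoTime();
            // A cheap in-memory get that is queued behind the slow load.
            Future<String> fastGet = partitionThread.submit(() -> "value");
            fastGet.get(slowLoadMillis + 5000, TimeUnit.MILLISECONDS);
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            partitionThread.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("fast get waited ~" + fastGetLatencyMillis(300) + " ms");
    }
}
```

In this model the fast get takes roughly as long as the slow load ahead of it. With many requests queued behind database calls on the same thread, waits of that kind could add up to the call timeout seen in the exception.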
