How does partitioning work in Hazelcast?

Question

From what I have seen, Hazelcast is most commonly used in architectures with more than 50 nodes. Does it make sense to use Hazelcast on an architecture of 1 to 4 nodes? If so, what is the best strategy regarding the partitions and the Hazelcast instances?

Let's say that I am using Hazelcast on only one node. How many Hazelcast instances should I use? Should I leave the partition count at its default (271), or, if it is better to change it, what factors should I consider when deciding?

Please someone enlighten me on this one.

score 7 · Accepted Answer · answered Apr 30 '14 at 12:43

It depends on your needs :)

For a map, for example, Hazelcast calculates the hash of the key and takes it modulo the partition count; the result determines the partition the key is stored on.
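
The hash-then-modulo step can be sketched in a few lines of plain Java. This is only an illustration: the class and method names are made up, and `Objects.hashCode` is an assumed stand-in for whatever hash Hazelcast computes internally (as far as I know it hashes the serialized key, not `hashCode()`).

```java
import java.util.Objects;

// Illustrative sketch only, not Hazelcast's implementation.
// Objects.hashCode is an ASSUMED stand-in for Hazelcast's internal key hash.
class PartitionSketch {
    static final int DEFAULT_PARTITION_COUNT = 271;

    // Maps a key to a partition id in the range [0, partitionCount).
    static int partitionFor(Object key, int partitionCount) {
        int hash = Objects.hashCode(key);
        // floorMod keeps the result non-negative even for negative hashes
        return Math.floorMod(hash, partitionCount);
    }

    public static void main(String[] args) {
        for (String key : new String[] {"user:1", "user:2", "order:42"}) {
            System.out.println(key + " -> partition "
                    + partitionFor(key, DEFAULT_PARTITION_COUNT));
        }
    }
}
```

The same key always lands on the same partition, which is what lets any member work out where an entry lives without asking the others.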

For a single-node setup Hazelcast could make sense, because its map provides more than a regular map, but its true value shows in a multi-node setup (2 or more nodes).

We have customers that run smaller clusters, e.g. 5 or 6 nodes. Even with 2 nodes you still get certain features.

About the partition count: we aim for a maximum partition size of 50 to 100 MB. So with 271 partitions at 50 MB each you get roughly 13 GB of data. If you have 26 GB of data, double the number of partitions.
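
The sizing rule above is simple arithmetic, sketched here in plain Java. The class and method names are hypothetical, and sizes are counted in decimal megabytes for simplicity. The count itself is set through the `hazelcast.partition.count` property.

```java
// Illustrative sizing helper; the names are hypothetical, not a Hazelcast API.
class PartitionSizing {
    // Total data (in MB) that fits if every partition holds at most maxPartitionMb.
    static long capacityMb(int partitionCount, long maxPartitionMb) {
        return (long) partitionCount * maxPartitionMb;
    }

    // Smallest partition count that keeps each partition at or below maxPartitionMb
    // (ceiling division).
    static int partitionsNeeded(long totalDataMb, long maxPartitionMb) {
        return (int) ((totalDataMb + maxPartitionMb - 1) / maxPartitionMb);
    }

    public static void main(String[] args) {
        System.out.println(capacityMb(271, 50));          // 13550 MB, about 13 GB
        System.out.println(partitionsNeeded(26_000, 50)); // 520
    }
}
```

For 26 GB at 50 MB per partition this gives 520, close to the 542 you get by simply doubling the default of 271.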

pveentjer

10,545
3
23
40

First of all, thank you very much for your input. The case here is that I am going to use Hazelcast as a distributed in-memory data grid in conjunction with Couchbase as the persistent storage. So I am trying to figure out whether this combination (Hazelcast on 4 nodes and Couchbase on 30-something) has something to offer. I have measured read/write throughput with MapStore, and even though Hazelcast is fast, Couchbase is even faster. The needs are very big (terabytes of data). – maria_k Apr 30 '14 at 14:37
We only provide the MapStore interface, so the actual implementation will account for a big part of the map's total performance. Another important factor is the configuration, e.g. write-through vs. write-behind. You could use a map with a sync backup (so the data is guaranteed to be in memory on at least one other machine), but with a write-behind MapStore, so that one of the members does the write. This way you can get more out of your MapStore performance. – pveentjer Apr 30 '14 at 18:59
Thanks again. One final question: I have a Couchbase connection instance. Do the partitions use this instance independently and write concurrently, or does each partition wait for the previous one to finish before continuing? – maria_k May 02 '14 at 16:48
You are talking about Hazelcast partitions? Partitions don't do anything by themselves. – pveentjer May 02 '14 at 19:02
Is there any way to configure the number of threads used by MapStore in order to write to the database concurrently (something like connection pooling)? How exactly does MapStore work? I have implemented it, but apart from write-delay-seconds I haven't found any other property for tuning the configuration to my needs. – maria_k May 03 '14 at 10:23
I've studied Hazelcast a lot and I've seen that it provides many features for in-memory processing, but synchronous writes to the Couchbase DB are a prerequisite, and I haven't seen many configuration properties related to MapStore. That is why I am asking all these questions: to see whether I am missing anything. Imagine that there will be more than one MapStore implementation; this is why I want to use it as wisely as I can. – maria_k May 03 '14 at 10:54
