
I have a clustered system set up with Hazelcast to store my data. Each node in the cluster is responsible for connecting to a service on localhost and piping data from this service into the Hazelcast cluster.

I would like this data to be stored primarily on the node that received it, and also processed on that node. I'd like the data to be readable and writable from the other nodes as well, accepting moderately lower performance for those remote accesses.

I started with a naive implementation that does exactly what I described, with no special considerations. I noticed performance suffered quite a bit (we had a separate Infinispan-based implementation to compare it with). Generally speaking, there is little logical intersection between the data I'm processing from each individual service; it's stored in a Hazelcast cluster so it can be read, and occasionally written, from all nodes, and for failover scenarios. I still need to be able to read the last good state of a failed node, whether it's the Hazelcast member or the local service on that node that fails.

So my first attempt at co-locating the data and reducing network chatter was to key much of the data with a serverId (a number from 1 to 3 on, say, a 3-node system), include it in the key, and have the key implement PartitionAware. I didn't notice an improvement in performance, so I decided to execute the logic itself on the cluster and key it the same way (with a PartitionAware Runnable submitted to a DurableExecutorService). I figured that if I couldn't select which member the logic would be processed on, I could at least execute it consistently on the same member, co-located with the data.
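For reference, here's a minimal sketch of the kind of key and task submission I mean, assuming Hazelcast 3.x; the class, map and executor names below are made up for illustration:

```java
import com.hazelcast.core.PartitionAware;
import java.io.Serializable;

// Hypothetical key: every key carrying the same serverId maps to the
// same partition, and therefore to the same owning member.
public class ServerScopedKey implements PartitionAware<Integer>, Serializable {
    private final int serverId;   // 1..3 on a 3-node system
    private final String dataId;  // identifies the individual entry

    public ServerScopedKey(int serverId, String dataId) {
        this.serverId = serverId;
        this.dataId = dataId;
    }

    @Override
    public Integer getPartitionKey() {
        return serverId;
    }

    // equals()/hashCode() over both fields omitted for brevity.
}
```

And the task submission, keyed the same way (ProcessServerDataTask is a hypothetical Runnable):

```java
import com.hazelcast.durableexecutor.DurableExecutorService;

DurableExecutorService executor = hazelcastInstance.getDurableExecutorService("processing");
// Runs on whichever member owns the partition for this serverId,
// i.e. co-located with entries keyed by ServerScopedKey(serverId, ...).
executor.submitToKeyOwner(new ProcessServerDataTask(serverId), serverId);
```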

That made performance even worse, as all data and all execution tasks were being stored and run on a single node. I figured this meant node #1 was getting partitions 1 to 90, node #2 partitions 91 to 180, and node #3 partitions 181 to 271 (or some variant of this; I don't know the key hash algorithm or exactly how my int serverId translates to a partition number), so hashing serverIds 1, 2 and 3 landed them all on, e.g., the oldest member, which ended up with all the data and execution tasks.
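One way to confirm that kind of skew (a sketch, assuming Hazelcast 3.x) is to ask the PartitionService which member owns the partition each serverId hashes to:

```java
import com.hazelcast.core.Partition;
import com.hazelcast.core.PartitionService;

PartitionService partitionService = hazelcastInstance.getPartitionService();
for (int serverId = 1; serverId <= 3; serverId++) {
    // Same partition as any key whose getPartitionKey() returns this serverId.
    Partition partition = partitionService.getPartition(serverId);
    System.out.printf("serverId %d -> partition %d owned by %s%n",
            serverId, partition.getPartitionId(), partition.getOwner());
}
```

If all three serverIds print the same owner, the skew is confirmed.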

My next attempt was to set backup count to (member count) - 1 and enable backup reads. That improved things a little.
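For reference, a minimal sketch of that configuration, assuming Hazelcast 3.x programmatic config and a hypothetical map name:

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;

// 3 members, so 2 synchronous backups give every member a copy of each
// entry, and read-backup-data lets a member serve reads from its local
// backup instead of going to the partition owner.
MapConfig mapConfig = new MapConfig("serviceData")
        .setBackupCount(2)
        .setReadBackupData(true);

Config config = new Config().addMapConfig(mapConfig);
```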

I then looked into ReplicatedMap, but it doesn't support indexing or predicates. One of my motivations for moving to Hazelcast was its more comprehensive support (and, from what I've seen, better performance) for indexing and querying map data.
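For context, this is roughly the IMap indexing/predicate usage I rely on and that ReplicatedMap lacks (a sketch, assuming Hazelcast 3.x; the value type and attribute name are made up):

```java
import com.hazelcast.core.IMap;
import com.hazelcast.query.Predicates;
import java.util.Collection;

// hazelcastInstance obtained elsewhere; ServiceRecord is a hypothetical value type.
IMap<ServerScopedKey, ServiceRecord> map = hazelcastInstance.getMap("serviceData");

// Ordered index on a field of the stored value, usable by equality and range predicates.
map.addIndex("status", true);

Collection<ServiceRecord> pending = map.values(Predicates.equal("status", "PENDING"));
```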

I'm not convinced any of these are the right approaches (especially since mapping 3 node numbers to partition numbers doesn't match how partitions were intended to be used). Is there anything else I can look at that would provide this kind of layout, with one member being a preferred primary for the data while still having readable backups on one or more other members after a failure?

Thanks!

1 Answer


Data grids provide scalability: you can add or remove storage nodes to adjust capacity, and for this to work the grid needs to be able to rebalance the data load. Rebalancing means moving some of the data from one place to another. So, as a general rule, the placement of data is out of your control and may change while the grid runs.

Partition awareness will keep related items together; if they move, they move together. A runnable/callable that accesses both can then be satisfied from a single JVM, so it will be more efficient.
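As an illustration (not from the answer itself), a callable submitted to the key owner can read several co-located entries without any remote calls; this sketch assumes Hazelcast 3.x and reuses the hypothetical ServerScopedKey and "serviceData" map from the question's sketch above:

```java
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.HazelcastInstanceAware;
import com.hazelcast.core.IMap;
import java.io.Serializable;
import java.util.concurrent.Callable;

// Hypothetical task: when submitted with submitToKeyOwner(task, serverId),
// it runs on the member owning that partition, so both map.get() calls
// below are local reads (the entries share the same partition key).
public class ReadRelatedEntriesTask
        implements Callable<String>, HazelcastInstanceAware, Serializable {

    private final ServerScopedKey keyA;
    private final ServerScopedKey keyB;
    private transient HazelcastInstance hz;

    public ReadRelatedEntriesTask(ServerScopedKey keyA, ServerScopedKey keyB) {
        this.keyA = keyA;
        this.keyB = keyB;
    }

    @Override
    public void setHazelcastInstance(HazelcastInstance hazelcastInstance) {
        this.hz = hazelcastInstance;
    }

    @Override
    public String call() {
        IMap<ServerScopedKey, String> map = hz.getMap("serviceData");
        return map.get(keyA) + map.get(keyB);
    }
}
```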

There are two possible improvements if you really need data local to a particular node: read-backup-data or near-cache. See this answer. Either or both will help reads, but not writes.
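For completeness, a sketch of what enabling both might look like with programmatic configuration, assuming Hazelcast 3.x and a hypothetical map named "serviceData":

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.InMemoryFormat;
import com.hazelcast.config.MapConfig;
import com.hazelcast.config.NearCacheConfig;

// Near-cache keeps recently read entries on the calling member.
NearCacheConfig nearCache = new NearCacheConfig()
        .setInMemoryFormat(InMemoryFormat.OBJECT)  // skip deserialization on local hits
        .setInvalidateOnChange(true);              // drop stale entries when the owner updates them

MapConfig mapConfig = new MapConfig("serviceData")
        .setReadBackupData(true)        // serve reads from local backup copies
        .setNearCacheConfig(nearCache);

Config config = new Config().addMapConfig(mapConfig);
```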

Neil Stevenson
  • The problem with our integration with these standalone services is that there's a fair amount of communication back to the service. So even if I manage to group all the data together, sending it off for processing to the owner node would still involve a fair amount of communication back to this one node. Keeping the data and processing it local to the service it's connected to cuts down on a huge amount of network traffic. – Scott Van Wart Oct 23 '18 at 11:20
  • I think I'll look into near-cache, but I'm not sure exactly what "limited" means for TransactionalMap support. – Scott Van Wart Oct 23 '18 at 12:22
  • Is the partition assignment pluggable? Can I replace or supplant the assignment/rebalance logic to influence how partitions are assigned? I won't break anything, I promise. – Scott Van Wart Oct 23 '18 at 17:47
  • It's not pluggable, sorry. – Neil Stevenson Oct 25 '18 at 07:42