Transferring data between clusters using GemFire

Question

I have searched for solutions to my use case but have not found the right one, so I am hoping for some ideas to explore further.

I have two GemFire (version 8.2) clusters, private and public, each storing 110+ GB of data without persisting to a disk store. The private cluster loads data from the DB and transmits entries to the public cluster through a WAN gateway as long as both clusters are online. In my use case I restart only the public cluster, but it loses its data afterwards. To repopulate it I have to restart the private cluster and reload the data from the DB into the private cluster, which in turn transmits the data over the WAN.

I can't populate the public cluster directly from the DB, because that puts load on my master DB and would affect other applications.

I have tried several solutions.

First: Exporting the dataset from the private cluster and then importing it into the public one. This disconnects the private cluster's GemFire nodes, because each region stores a large volume of data, and I also have limited disk space for downloading large volumes of data.

Second: I could expose a JMX bean from the public cluster, then run a client program that invokes a GemFire function in the private cluster, which iterates through the entries and pushes them into the public cluster through JMX. However, my organization's infrastructure doesn't let me expose JMX beans on GemFire nodes.

Third: Similar to the second one, a GemFire function can transmit data to the public cluster through a queue. This seems to work but has its own limitations: the queue can only transfer text messages of up to 1 MB, so I need to handle large objects specially, and the transfer involves unnecessary serialization and deserialization (JSON text messages).
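One way to handle values larger than the 1 MB message limit described above is to split each serialized value into chunks and reassemble them on the receiving side. The following is only a minimal sketch under assumptions: the names PayloadChunker, split and join are hypothetical, the 1 MB limit is taken from the description above, and the code works on raw bytes rather than on any particular queue API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PayloadChunker {
    // Assumed per-message limit, taken from the 1 MB figure in the question.
    static final int MAX_CHUNK_BYTES = 1024 * 1024;

    /** Splits a serialized value into chunks of at most maxChunkBytes each. */
    static List<byte[]> split(byte[] payload, int maxChunkBytes) {
        if (maxChunkBytes <= 0) {
            throw new IllegalArgumentException("maxChunkBytes must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        int offset = 0;
        while (offset < payload.length) {
            int len = Math.min(maxChunkBytes, payload.length - offset);
            chunks.add(Arrays.copyOfRange(payload, offset, offset + len));
            offset += len;
        }
        return chunks;
    }

    /** Reassembles chunks, in their original order, into one byte array. */
    static byte[] join(List<byte[]> chunks) {
        int total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        byte[] out = new byte[total];
        int pos = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, out, pos, chunk.length);
            pos += chunk.length;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] value = new byte[2_500_000]; // stand-in for one large serialized entry
        Arrays.fill(value, (byte) 7);
        List<byte[]> chunks = split(value, MAX_CHUNK_BYTES);
        System.out.println("chunks: " + chunks.size());
        System.out.println("round trip ok: " + Arrays.equals(value, join(chunks)));
    }
}
```

If the queue accepts only text, each chunk would also need a text encoding such as Base64, which grows the data by roughly one third, so the raw chunk size would have to be correspondingly smaller than the message limit. The receiver also needs some way to know the chunk order and count, which this sketch leaves out.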

Is there any way to ask the private cluster to re-transmit all data through the WAN gateway, or is there another solution someone can suggest for me to explore?

score 1 · Accepted Answer · answered Feb 07 '19 at 19:36

You can try "gemtouch" in the open source project gemfire-toolkit.

It sounds very similar to idea 2, but it doesn't require exposing a JMX bean. It does use JMX the same way gfsh does. If that's a problem, you could easily remove the use of JMX, as it only uses JMX to retrieve the list of regions.
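To illustrate the general "touch" idea (re-putting every entry with its current value so that each put is queued for the WAN gateway again), here is a toy model in plain Java. It is a sketch of the concept only, not gemtouch's code and not the GemFire API: ModelRegion, touchAll and drain are hypothetical names, and the "WAN queue" is just an in-memory queue. In a real cluster the equivalent step would presumably run server-side and call something like `region.put(key, region.get(key))` on each entry.

```java
import java.util.AbstractMap;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class TouchSketch {
    /** Toy stand-in for a region with a WAN sender attached (not a GemFire class). */
    static class ModelRegion {
        final Map<String, String> data = new LinkedHashMap<>();
        final Queue<Map.Entry<String, String>> wanQueue = new ArrayDeque<>();

        // Every put updates local data and enqueues an event for the remote site.
        void put(String key, String value) {
            data.put(key, value);
            wanQueue.add(new AbstractMap.SimpleImmutableEntry<>(key, value));
        }
    }

    /** Re-puts every entry with its current value so it is queued again. */
    static int touchAll(ModelRegion region) {
        List<String> keys = new ArrayList<>(region.data.keySet()); // snapshot of keys
        for (String key : keys) {
            region.put(key, region.data.get(key));
        }
        return keys.size();
    }

    /** Delivers all queued events to the (model) remote cluster. */
    static void drain(ModelRegion from, Map<String, String> remote) {
        Map.Entry<String, String> event;
        while ((event = from.wanQueue.poll()) != null) {
            remote.put(event.getKey(), event.getValue());
        }
    }

    public static void main(String[] args) {
        ModelRegion privateRegion = new ModelRegion();
        Map<String, String> publicCluster = new HashMap<>();
        privateRegion.put("k1", "v1");
        privateRegion.put("k2", "v2");
        drain(privateRegion, publicCluster);

        publicCluster.clear(); // models a public-cluster restart without persistence
        int touched = touchAll(privateRegion);
        drain(privateRegion, publicCluster);
        System.out.println("touched " + touched + ", in sync: "
                + publicCluster.equals(privateRegion.data));
    }
}
```

The point of the model is that nothing is reloaded from the DB: the private side only rewrites what it already holds, and the normal replication path carries the entries across.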

answered Feb 07 '19 at 19:36

Randy May

Thanks Randy. At first glance it looks promising to me; I will explore it further to see whether it is feasible for my use case. – sanit Feb 08 '19 at 03:51

score 0 · Answer 2 · answered Nov 19 '19 at 14:02

I have the same problem, but I am working with 3 Geode clusters (each in a different location).

When something weird happens in one of the clusters, we need to recover it using one of the 2 remaining clusters:

1. If we "touch" one of the clusters, all of its data will replicate to the cluster that needs recovery, but also to the other cluster that is actually fine. That is probably harmless, but I would appreciate any opinion.
2. If we keep running traffic on the 2 remaining clusters while GemTouch is running on one of them, I guess some consistency problems between the clusters could pop up, but I am not sure.
3. The last topic is the license of gemfire-toolkit. There is no LICENSE file, so I am not 100% sure whether the tool can be used.

I referred to the gemtouch code and modified it to fit my requirements, and it is working fine. On your second point about consistency, I believe that will never happen with a partitioned region. While you are running gemtouch, each entry is queued for the WAN gateway, and if that record changes at the same time, the change is processed by the server node that holds the master copy. So in both cases the duplicate records are processed by the same server node, in order. – sanit Jan 04 '20 at 09:34
