1

We need to load millions of key/value pairs into Apache Geode and would like to know what options are available. Our values are in the 256 KB range.

Newbie

2 Answers

2

There are several options depending on your application requirements/SLAs or whether you need to perform conversion or other transformations, etc.

  1. Out of the box, Apache Geode provides the Cache & Region Snapshot Service. This is useful when you want to migrate data from one existing Apache Geode cluster to another, for instance. It is less useful if your data comes from an external source, like an RDBMS.

  2. Another option is to load the data lazily, on demand. You can do this by implementing the CacheLoader interface and registering the CacheLoader with a Region. You could also write a CacheLoader implementation that, in addition to loading and returning the single value of interest for the current request, loads a block of related data based on some rules/criteria.

  3. Often, users create an external, custom conversion process or tool to extract, transform and bulk load (ETL) data into Apache Geode. This is typical for complex use cases or requirements. However, it is highly advisable to use a framework/tool like...

  4. Spring XD (now Spring Cloud Data Flow on Pivotal's Cloud Foundry (PCF)) is a great ETL tool and pipeline for creating stream-based applications. Spring XD / SCDF provides many different options for "sources" and "sinks" (e.g. GemFire Server). In addition to sources & sinks, you can even "tap" the stream to process the data with "Processors". So whether you are doing real-time stream or batch-oriented data operations (e.g. bulk loads), Spring XD is a great option.

  5. I am sure Google might provide other answers on how to perform ETL with a KeyValue store like Apache Geode.
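To make option 2 a bit more concrete, here is a minimal plain-Java sketch of the "load a block on a miss" idea. It deliberately does not use the Geode API: `BlockPrefetchLoader` and `blockFetcher` are hypothetical names, and a `ConcurrentHashMap` stands in for the Region (a Geode `Region` is itself a `ConcurrentMap`). In a real deployment, similar logic would live behind a `CacheLoader` registered on the Region, and that wiring is not shown here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Read-through loader that fetches a whole block of entries on a cache miss.
 * Hypothetical helper for illustration; not part of Apache Geode.
 */
final class BlockPrefetchLoader<K, V> {

    // Stand-in for the Region holding the loaded entries.
    private final Map<K, V> cache = new ConcurrentHashMap<>();

    // Assumption: given a key, returns that entry plus related entries
    // from the external source (e.g. a range query against an RDBMS).
    private final Function<K, Map<K, V>> blockFetcher;

    BlockPrefetchLoader(Function<K, Map<K, V>> blockFetcher) {
        this.blockFetcher = blockFetcher;
    }

    /** Returns the value for key, loading its whole block on a miss. */
    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        // Miss: fetch the block once and keep every entry for later requests.
        Map<K, V> block = blockFetcher.apply(key);
        cache.putAll(block);
        return block.get(key);
    }

    /** Number of entries currently held in the stand-in cache. */
    int cachedEntries() {
        return cache.size();
    }
}
```

With values around 256 KB each, the block size is worth keeping small, since every miss pulls the whole block into memory at once.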

Hope this helps get you going.

Cheers, John

John Blum
1

We have very limited options for loading GemFire regions.

1) Spring batch:

  • Create a GemFire writer to load data and remove data
  • Create the batch configuration and load it

2) Apache Spark
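
The chunked-writer idea behind option 1 can be sketched in plain Java without Spring Batch. This is only an illustration under stated assumptions: `ChunkedBulkLoader` and its `sink` parameter are hypothetical names, only the load side is shown (not removal), and the sink is any function that accepts a map of entries. Since a Geode/GemFire `Region` is a `Map`, passing `region::putAll` as the sink should work, but that is not exercised here.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical helper: writes a stream of entries to a sink in fixed-size chunks. */
final class ChunkedBulkLoader {

    private ChunkedBulkLoader() {
    }

    /**
     * Drains entries into sink in chunks of at most chunkSize entries.
     * Returns the number of chunks handed to the sink.
     */
    static <K, V> int load(Iterator<Map.Entry<K, V>> entries,
                           int chunkSize,
                           Consumer<Map<K, V>> sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        Map<K, V> chunk = new LinkedHashMap<>();
        while (entries.hasNext()) {
            Map.Entry<K, V> entry = entries.next();
            chunk.put(entry.getKey(), entry.getValue());
            if (chunk.size() == chunkSize) {
                sink.accept(chunk);
                chunks++;
                // Start a fresh map so the sink may keep the previous one.
                chunk = new LinkedHashMap<>();
            }
        }
        // Flush the final, possibly smaller, chunk.
        if (!chunk.isEmpty()) {
            sink.accept(chunk);
            chunks++;
        }
        return chunks;
    }
}
```

With values of roughly 256 KB, a chunk of 100 entries is about 25 MB per write, so the chunk size is the main knob for trading memory against the number of round trips.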

vaquar khan