Region.getAll(keys) is a sequential operation, iterating over each key in the provided Collection individually and fetching the value from the Region. If you trace through the source code from Region.getAll(keys), you will eventually arrive here.
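In other words, conceptually (this is only a rough sketch of the behavior just described, not the actual Geode source), it behaves something like...

static <K, V> Map<K, V> sequentialGetAll(Region<K, V> region, Collection<K> keys) {
  Map<K, V> results = new HashMap<>();
  for (K key : keys) {
    results.put(key, region.get(key)); // one fetch per key, one after the other
  }
  return results;
}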
If your Region is a PARTITION Region (highly recommended), you could take advantage of Geode's parallel Function Execution, something like...
Region<K, V> myPartitionRegion = ...
...
Set<K> keysOfInterest = ...
...
Execution functionExecution = FunctionService.onRegion(myPartitionRegion)
  .withFilter(keysOfInterest)
  .withArgs(...);
ResultCollector<?, ?> results = functionExecution.execute("myFunctionId");
// process the results.
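For example, to gather everything back into a single Map on the caller, and assuming the Function below sends each member's Map back via its ResultSender and the default ResultCollector is in play (which, to my knowledge, gathers those results into a List), you could follow the execute(..) call above with something like...

@SuppressWarnings("unchecked")
List<Map<K, V>> resultsPerMember = (List<Map<K, V>>) results.getResult();

Map<K, V> combinedResults = new HashMap<>();

for (Map<K, V> memberResults : resultsPerMember) {
  combinedResults.putAll(memberResults); // one Map per member that hosted filtered keys
}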
Then your Function implementation...
import java.util.Map;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.execute.FunctionAdapter;
import org.apache.geode.cache.execute.FunctionContext;
import org.apache.geode.cache.execute.RegionFunctionContext;
import org.apache.geode.cache.partition.PartitionRegionHelper;

class MyOnRegionFunction<K, V> extends FunctionAdapter {

  public void execute(FunctionContext context) {
    assert context instanceof RegionFunctionContext :
      String.format("This Function [%s] must be executed on a Region", getId());
    RegionFunctionContext regionContext = (RegionFunctionContext) context;
    // Only the data hosted locally by this member for the filtered keys.
    Region<K, V> localData = PartitionRegionHelper.getLocalDataForContext(regionContext);
    Map<K, V> results = localData.getAll(regionContext.getFilter());
    // do whatever with results; e.g. send back to the caller via context.getResultSender()...
  }

  public String getId() {
    return "myFunctionId";
  }
}
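One detail the snippets above gloss over: the Function has to be registered on the servers (e.g. during cache initialization, or by deploying the JAR) before it can be invoked by id. A minimal sketch, using the id from above...

// On each server hosting the PARTITION Region, before the caller invokes execute("myFunctionId")...
FunctionService.registerFunction(new MyOnRegionFunction<>());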
When you set a "Filter
" on the Execution
, which is a Set of Keys used to "route" the Function execution to the data nodes in the cluster containing those "keys", then in effect, you have (somewhat) parallelized the getAll
operation (well, to the extent that only keys on that node are part of the Filter in that "context", i.e. this).
There is perhaps a better, more complete example of this here; see the section "Write the Function Code".
You should probably also read up on "How Function Execution Works" and on PARTITION Regions. Also pay attention to this...
An application needs to perform an operation on the data associated with a key. A registered server-side function can retrieve the data, operate on it, and put it back, with all processing performed locally to the server.
That is the first bullet on this page.
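Purely as a rough, hypothetical sketch of that bullet (transform stands in for whatever operation you need; it uses the same PartitionRegionHelper technique shown above, and the cast is unchecked since this is only a sketch)...

// Inside a Function's execute(..): retrieve, operate on, and put back the data, all local to the member.
static <K, V> void operateLocally(RegionFunctionContext regionContext, UnaryOperator<V> transform) {
  Region<K, V> localData = PartitionRegionHelper.getLocalDataForContext(regionContext);
  for (Object key : regionContext.getFilter()) {
    V value = localData.get(key);                    // retrieve locally
    localData.put((K) key, transform.apply(value));  // operate and put it back locally
  }
}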
You can even associate a CacheLoader with the "logical" PARTITION Region, and when the fetch is made inside the Function and the data is not available, the loader will (should) operate locally to that node, since the Function is only fetching keys that would be routed to that node anyway (based on the partitioning strategy, a bucket "hash" by default).
I have not tried the latter, but off the top of my head I don't see why it would not work.
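For what it is worth (untested, per above), wiring a CacheLoader onto the PARTITION Region might look something like this; everything here is from org.apache.geode.cache, and the loader body is just a placeholder for your own lookup...

Cache cache = new CacheFactory().create();

Region<String, String> myPartitionRegion = cache
  .<String, String>createRegionFactory(RegionShortcut.PARTITION)
  .setCacheLoader(new CacheLoader<String, String>() {
    public String load(LoaderHelper<String, String> helper) {
      // Placeholder for a real lookup (e.g. a database call); this runs on the member
      // hosting the bucket for helper.getKey() when the value is not already present.
      return "value-for-" + helper.getKey();
    }
    public void close() { }
  })
  .create("myPartitionRegion");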
Anyway, hope this helps!
-John