Reading between the lines in your problem statement, it seems you are:
- Executing an OQL query on a `PARTITION` Region (PR).
- Running the query inside a `Function`, as recommended when executing queries on a PR.
- Sending batched results (as opposed to streaming the results).
I also assume, since you posted exclusively in the #spring-data-gemfire channel, that you are using Spring Data GemFire (SDG) to:
- Execute the query (e.g. by using the SDG `GemfireTemplate`; of course, you could also be using the GemFire Query API directly inside your Function).
- Implement the server-side Function using SDG's Function annotation support.
- And possibly (indirectly) use SDG's `BatchingResultSender`, as described in the documentation.
NOTE: The default batch size in SDG is `0`, NOT `100`. Zero means the results are streamed individually.
Regarding #2 & #3, your implementation might look something like the following:
```java
@Component
class MyApplicationFunctions {

    @GemfireFunction(id = "MyFunction", batchSize = "1000")
    public List<SomeApplicationType> myFunction(FunctionContext functionContext) {

        RegionFunctionContext regionFunctionContext =
            (RegionFunctionContext) functionContext;

        Region<?, ?> region = regionFunctionContext.getDataSet();

        if (PartitionRegionHelper.isPartitionedRegion(region)) {
            region = PartitionRegionHelper.getLocalDataForContext(regionFunctionContext);
        }

        GemfireTemplate template = new GemfireTemplate(region);

        String OQL = "...";

        SelectResults<?> results = template.query(OQL); // or `template.find(OQL, args);`

        List<SomeApplicationType> list = ...;

        // process `results`, convert to SomeApplicationType, add to `list`

        return list;
    }
}
```
NOTE: Since you are most likely executing this Function "on Region", the `FunctionContext` type will actually be a `RegionFunctionContext` in this case.
The `batchSize` attribute on the SDG `@GemfireFunction` annotation (used on `Function` "implementations") allows you to control the batch size.
Instead of using SDG's `GemfireTemplate` to execute queries, you can, of course, use the GemFire Query API directly, as mentioned above.
If you need even more fine-grained control over "result sending", then you can simply "inject" the `ResultSender` provided by GemFire into the `Function`, even when the `Function` is implemented with SDG, as shown above. For example:
```java
@Component
class MyApplicationFunctions {

    @GemfireFunction(id = "MyFunction")
    public void myFunction(FunctionContext functionContext, ResultSender resultSender) {

        ...

        SelectResults<?> results = ...;

        // now process the results and use the `resultSender` directly
    }
}
```
This allows you to "send" the results however you see fit, as required by your application.
You can batch/chunk results, stream, whatever.
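To make the batching idea concrete, here is a self-contained, plain-Java sketch of the chunking logic a `ResultSender`-based `Function` might use. `BatchingSketch`, `toBatches`, and `sendInBatches` are illustrative names of my own, and the `Consumer` merely stands in for calls to GemFire's `ResultSender.sendResult(..)` / `lastResult(..)`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchingSketch {

    // Split a full result set into batches of at most `batchSize` elements.
    static <T> List<List<T>> toBatches(List<T> results, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int index = 0; index < results.size(); index += batchSize) {
            int end = Math.min(index + batchSize, results.size());
            batches.add(new ArrayList<>(results.subList(index, end)));
        }
        return batches;
    }

    // Send each batch; `sendChunk` stands in for `resultSender.sendResult(batch)`.
    // A real Function would send the final batch with `resultSender.lastResult(batch)`.
    static <T> void sendInBatches(List<T> results, int batchSize, Consumer<List<T>> sendChunk) {
        for (List<T> batch : toBatches(results, batchSize)) {
            sendChunk.accept(batch);
        }
    }
}
```

With `batchSize = 3` and ten results, `toBatches` produces four chunks of sizes 3, 3, 3, and 1.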
However, be mindful of the "receiving" side in this case!
The one thing that might not be apparent to the average GemFire user is that GemFire's default `ResultCollector` implementation collects *all* the results first, before returning them to the application. This means the receiving side does not see the results as a stream or in batches/chunks, and cannot process them as soon as the server sends them (whether streamed, batched/chunked, or otherwise).
Once again, SDG helps you out here, since you can provide a custom `ResultCollector` on the `Function` "execution" (client-side), for example:
```java
@OnRegion(region = "SomePartitionRegion", resultCollector = "myResultCollector")
interface MyApplicationFunctionExecution {

    void myFunction();

}
```
In your Spring configuration, you would then have:
```java
@Configuration
class ApplicationGemFireConfiguration {

    @Bean
    ResultCollector myResultCollector() {
        return ...;
    }
}
```
Your "custom" `ResultCollector` could return the results as a stream, a batch/chunk at a time, etc.
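To illustrate the streaming idea, here is a self-contained, plain-Java sketch of a collector that hands results off as they arrive instead of accumulating them all first. `StreamingCollectorSketch` and its `addResult`/`endResults`/`consumeAll` methods are hypothetical names that only mirror the shape of GemFire's `ResultCollector` contract; a real implementation would implement that interface instead:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class StreamingCollectorSketch<T> {

    private static final Object END = new Object(); // sentinel marking end of results

    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();

    // Called as each result arrives from the server(s); nothing is accumulated up front.
    void addResult(T result) {
        queue.offer(result);
    }

    // Called once all servers have sent their last result.
    void endResults() {
        queue.offer(END);
    }

    // Drains results as they become available, blocking until the end marker is seen.
    @SuppressWarnings("unchecked")
    List<T> consumeAll() {
        List<T> consumed = new ArrayList<>();
        try {
            for (Object next = queue.take(); next != END; next = queue.take()) {
                consumed.add((T) next);
            }
        } catch (InterruptedException cause) {
            Thread.currentThread().interrupt();
        }
        return consumed;
    }
}
```

The key design point is the blocking queue between producer and consumer: the client thread can start processing the first result while later results are still in flight.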
In fact, I have prototyped a "streaming" `ResultCollector` implementation that will eventually be added to SDG, here.
Anyway, this should give you some ideas on how to handle the performance problem you seem to be experiencing. 1000 results is not a lot of data, so I suspect your problem is mostly self-inflicted.
Hope this helps!