Querying GemFire Region by partial key

Question

When the key is a composite of id1 and id2 in a GemFire Region, and the Region is partitioned on id1, what is the best way of getting all the rows whose key matches id1?

A couple of options that we are considering:

1. Create another index on id1. If we do that, does the query go against all Partition Regions?
2. Write a data-aware Function and filter by (id1, null) to target the specific Partition Region, then use the index in the local Region by using QueryService.

Can you please let me know if there is any other way to achieve the query by partial key?

score 0 · Answer 1 · answered Sep 10 '18 at 20:42

Well, it could be implemented (optimally) by using a combination of #1 and #2 in your "options" above, depending on whether your application domain object also stores/references the key, which would be the case if you were using SD[G] (Spring Data GemFire) Repositories.

This might be best explained with the docs and an example, particularly using the PartitionResolver interface Javadoc.

Say your "composite" Key was implemented as follows:

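// Package names assume Apache Geode / GemFire 9+ (org.apache.geode)
import org.apache.geode.cache.EntryOperation;
import org.apache.geode.cache.PartitionResolver;

// The key acts as its own PartitionResolver, so entries are routed by idOne alone
class CompositeKey implements PartitionResolver<CompositeKey, Object> {

  private final Object idOne;
  private final Object idTwo;

  CompositeKey(Object idOne, Object idTwo) {
    // argument validation as necessary
    this.idOne = idOne;
    this.idTwo = idTwo;
  }

  public String getName() {
    return "MyCompositeKeyPartitionResolver";
  }

  // Keys sharing the same idOne resolve to the same routing object
  public Object getRoutingObject(EntryOperation<CompositeKey, Object> opDetails) {
    return idOne;
  }

  public void close() {
    // nothing to release
  }

  // a real key also needs equals(), hashCode() and serialization support
}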
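To make the effect of routing on idOne concrete, here is a minimal plain-Java sketch with no GemFire dependency. RoutedKey, BucketSimulation and the bucket count are hypothetical names and values chosen for illustration, and the modulo hashing only approximates the idea of bucket assignment; it is not GemFire's actual algorithm.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the composite key; no GemFire types involved.
final class RoutedKey {

  final Object idOne;
  final Object idTwo;

  RoutedKey(Object idOne, Object idTwo) {
    this.idOne = Objects.requireNonNull(idOne, "idOne is required");
    this.idTwo = idTwo;
  }

  // Mirrors getRoutingObject(..): only idOne decides placement.
  Object routingObject() {
    return idOne;
  }

  // Equality uses both parts, so (id1, null) never equals (id1, x).
  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    }
    if (!(obj instanceof RoutedKey)) {
      return false;
    }
    RoutedKey that = (RoutedKey) obj;
    return idOne.equals(that.idOne) && Objects.equals(idTwo, that.idTwo);
  }

  @Override
  public int hashCode() {
    return Objects.hash(idOne, idTwo);
  }
}

final class BucketSimulation {

  // Assumed bucket count, used by this simulation only.
  static final int BUCKET_COUNT = 113;

  // Simplified placement: hash of the routing object modulo the bucket count.
  static int bucketFor(RoutedKey key) {
    return Math.floorMod(key.routingObject().hashCode(), BUCKET_COUNT);
  }

  // Groups keys by their simulated bucket.
  static Map<Integer, List<RoutedKey>> distribute(List<RoutedKey> keys) {
    Map<Integer, List<RoutedKey>> buckets = new HashMap<>();
    for (RoutedKey key : keys) {
      buckets.computeIfAbsent(bucketFor(key), id -> new ArrayList<>()).add(key);
    }
    return buckets;
  }
}
```

In this sketch every key with the same idOne lands in the same simulated bucket regardless of idTwo, while the keys themselves remain distinct under equals().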

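Then, you could invoke a Function that queries the results you desire. Note that FunctionService.onRegion(..) takes the Region itself rather than its name:

Region<?, ?> region = cache.getRegion("PartitionRegionName"); // cache: your existing cache instance (assumed)
Execution execution = FunctionService.onRegion(region);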

Optionally, you could use the returned Execution to filter on just the (composite) keys you want to query (to further qualify the data) when invoking the Function:

CompositeKey[] filter = { .. };

// withFilter(..) returns the Execution to use from here on
execution = execution.withFilter(Arrays.stream(filter).collect(Collectors.toSet()));

Of course, this is problematic if you do not know your keys in advance.

In that case, you might prefer to use the CompositeKey to identify your application domain object, which is necessary when using SD[G]'s Repository abstraction/extension:

@Region("MyPartitionRegion")
class ApplicationDomainObject {

  @Id
  CompositeKey identifier;

  ...
}

And then, you can code your Function to operate on the "local data set" of the Partition Region. That is, each data node in the cluster that hosts the Partition Region (PR) operates only on the data in the "buckets" it holds for that PR, which is accomplished by doing the following:

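import org.apache.geode.cache.Region;
import org.apache.geode.cache.execute.Function;
import org.apache.geode.cache.execute.FunctionContext;
import org.apache.geode.cache.execute.FunctionException;
import org.apache.geode.cache.execute.RegionFunctionContext;
import org.apache.geode.cache.partition.PartitionRegionHelper;
import org.apache.geode.cache.query.QueryException;
import org.apache.geode.cache.query.SelectResults;

class QueryPartitionRegionFunction implements Function<Object> {

  public void execute(FunctionContext<Object> functionContext) {

    // This cast is only valid when the Function was invoked with onRegion(..)
    RegionFunctionContext regionFunctionContext =
      (RegionFunctionContext) functionContext;

    // Read access limited to the data hosted on this node for this invocation
    Region<CompositeKey, ApplicationDomainObject> localDataSet =
      PartitionRegionHelper.getLocalDataForContext(regionFunctionContext);

    try {
      SelectResults<?> resultSet =
        localDataSet.query(String.format("identifier.idTwo = %s",
          regionFunctionContext.getArguments()));

      // process result set and use ResultSender to send results

    } catch (QueryException cause) {
      // Region.query(..) declares checked query exceptions
      throw new FunctionException(cause);
    }
  }
}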
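One caveat with building the predicate through String.format and %s: a String argument is rendered without quotes, so the resulting OQL would not treat it as a string literal. A small helper along the following lines could render the value first. This is only a sketch; OqlPredicates and equalsPredicate are hypothetical names, not part of GemFire, and the helper deliberately rejects values containing a single quote rather than guessing at escaping rules.

```java
import java.util.Objects;

// Hypothetical helper for rendering a simple "field = value" OQL predicate.
final class OqlPredicates {

  private OqlPredicates() {
  }

  static String equalsPredicate(String field, Object value) {
    Objects.requireNonNull(field, "field is required");
    Objects.requireNonNull(value, "value is required");

    // Numbers are rendered as-is.
    if (value instanceof Number) {
      return field + " = " + value;
    }

    // Everything else is rendered as a single-quoted string literal.
    String text = String.valueOf(value);
    if (text.indexOf('\'') >= 0) {
      throw new IllegalArgumentException(
        "Values containing a single quote are not supported by this sketch: " + text);
    }
    return field + " = '" + text + "'";
  }
}
```

For example, equalsPredicate("identifier.idTwo", "abc") yields identifier.idTwo = 'abc', which could then be passed to localDataSet.query(..) in place of the String.format call.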

Of course, all of this is much easier to do using SDG's Function annotation support (for both implementing and invoking your Function).

You can invoke the Function on the Region either with GemFire's FunctionService.onRegion(..) or, more conveniently, with SDG's annotation support for Function Execution, like so:

@OnRegion("MyPartitionRegion")
interface MyPartitionRegionFunctions {

    @FunctionId("QueryPartitionRegion")
    <return-type> queryPartitionRegion(..);

}

Then:

Object resultSet = myPartitionRegionFunctions.queryPartitionRegion(..);

When the Function is invoked this way, the FunctionContext will be a RegionFunctionContext (because you executed the Function on the PR, which executes on all nodes in the cluster hosting the PR).

Additionally, you use PartitionRegionHelper.getLocalDataForContext(:RegionFunctionContext) to get the local data set of the PR (i.e. the bucket, or just the shard of data in the entire PR (across all nodes) hosted by that node, which would be based on your "custom" PartitionResolver).

You can then query to further qualify, or filter, the data of interest. You can see that I queried (or further qualified) by idTwo, which was not part of the PartitionResolver implementation. Additionally, this would only be required in the (OQL) query predicate if you did not specify keys in your filter with the Execution (since, I think, that would take the entire key (idOne & idTwo) into account, based on a properly implemented Object.equals() method in your CompositeKey class).

But, if you do not know the keys in advance and/or (especially if) you are using SD[G]'s Repositories, then the CompositeKey would be part of your application domain object, which you could then index and query on (as shown above: identifier.idTwo = ?).
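Conceptually, an index on a key field such as identifier.idOne lets a lookup by that field skip a full scan of the entries. The following plain-Java sketch mimics that idea with a hand-rolled map. Order and IdOneIndex are hypothetical types for illustration, and this is not how GemFire implements its indexes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain object that carries both parts of its key.
final class Order {

  final String idOne;
  final String idTwo;

  Order(String idOne, String idTwo) {
    this.idOne = idOne;
    this.idTwo = idTwo;
  }
}

// Hand-rolled secondary index: idOne -> all orders sharing that idOne.
final class IdOneIndex {

  private final Map<String, List<Order>> byIdOne = new HashMap<>();

  void add(Order order) {
    byIdOne.computeIfAbsent(order.idOne, key -> new ArrayList<>()).add(order);
  }

  // Returns a read-only view of the matches, or an empty list when there are none.
  List<Order> findByIdOne(String idOne) {
    return Collections.unmodifiableList(
      byIdOne.getOrDefault(idOne, Collections.emptyList()));
  }
}
```

A lookup by partial key then becomes a single map access on idOne, which is roughly the benefit you would hope to get from indexing the key field inside each node's local data set.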

Hope this helps!

NOTE: I have not tested any of this, but hopefully it will point you in the right direction and/or give you further ideas.
