I am running a dataflow job (Apache Beam SDK 2.1.0 Java, Google dataflow runner) and I need to read from the Google DataStore "distinctly" on one particular property. (like the good old "DISTINCT" keyword in SQL). Here is my code snippet :
Query.Builder q = Query.newBuilder();
q.addKindBuilder().setName("student-records");
q.addDistinctOn(PropertyReference.newBuilder().setName("studentId").build());
pipeline.apply(DatastoreIO.v1().read().withProjectId("project-id").withQuery(q.build()));
pipelilne.run();
When the pipeline runs, the read() fails due to the following error:
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: com.google.datastore.v1.client.DatastoreException: Inequality filter on key must also be a group by property when group by properties are set., code=INVALID_ARGUMENT
Could someone please tell me where I am going wrong.