I designed a system that stores raw data from IoT devices in Cloud Datastore via GAE Flex (PHP). I also want to move that data into BigQuery via Cloud Dataflow. However, I cannot find any standard or official documentation that describes how to read and export data between the Datastore and Dataflow services.
1 Answer
The easiest way to achieve this is to use BigQuery's ability to load Cloud Datastore backups: schedule a regular backup into a GCS bucket, then load that backup from GCS into BigQuery. [documentation]
If you want to use Dataflow, you can use the DatastoreIO source in Java or Python (sorry, no PHP here). [documentation]
Read results from a query into a PCollection:
    Pipeline p = Pipeline.create(options);
    PCollection<Entity> entities = p.apply(
        DatastoreIO.v1().read()
            .withProjectId(projectId)
            .withQuery(myQueryObject));
Then write this PCollection to wherever you want the data.
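Before the data can land in BigQuery, each entity has to be flattened into a row. The following is only a rough sketch of that conversion step, using plain Java maps as stand-ins for the Datastore `Entity` and the BigQuery row so that it runs without any dependencies. The class name `EntityToRow`, the `entity_key` column, and the sample property names are all hypothetical, and the renaming rule assumes the target table uses column names made of letters, digits, and underscores that do not start with a digit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: plain maps stand in for Entity and the BigQuery row.
class EntityToRow {

    // Flattens one entity (its key name plus its properties) into a row-shaped map.
    static Map<String, Object> toRow(String keyName, Map<String, Object> properties) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("entity_key", keyName); // assumed column holding the entity's key name
        for (Map.Entry<String, Object> p : properties.entrySet()) {
            // Replace anything that is not a letter, digit, or underscore.
            String column = p.getKey().replaceAll("[^A-Za-z0-9_]", "_");
            // Prefix names that would otherwise start with a digit.
            if (!column.isEmpty() && Character.isDigit(column.charAt(0))) {
                column = "_" + column;
            }
            row.put(column, p.getValue());
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("device-id", "sensor-42"); // made-up sample data
        props.put("temperature", 21.5);
        System.out.println(toRow("reading-1", props));
    }
}
```

In a real pipeline this logic would presumably sit in a transform between the DatastoreIO read and a BigQuery sink such as `BigQueryIO.writeTableRows()`, producing `TableRow` objects instead of plain maps.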

Dan McGrath