
I plan to store raw data from IoT devices in Cloud Datastore via GAE Flex (PHP). I also want to move that data into BigQuery via Cloud Dataflow. However, I cannot find any standard or official documentation that describes how to read and export data between the Datastore and Dataflow services.

Dan McGrath
Suthat

1 Answer


The easiest way to achieve this is to use BigQuery's ability to load Cloud Datastore backups: schedule a regular backup into a GCS bucket, then load that backup from GCS into BigQuery. [documentation]

If you want to use Dataflow, you can use the DatastoreIO source in Java or Python (sorry, no PHP here). [documentation]

Read results from a query into a PCollection:

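// Build the pipeline, then read every entity matched by the query.
Pipeline p = Pipeline.create(options);
PCollection<Entity> entities = p.apply(
    DatastoreIO.v1().read()
        .withProjectId(projectId)   // Cloud project that owns the Datastore
        .withQuery(myQueryObject)); // a Datastore v1 Query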

Then write this PCollection to whichever sink you need, for example BigQuery.
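Before the entities can go into BigQuery, each one has to be turned into a flat row. Here is a rough sketch of that per-element step in plain Java. It deliberately uses `java.util.Map` as a stand-in for both the Datastore entity's properties and the BigQuery row, so it runs without any SDK on the classpath. The class name `EntityRowSketch`, the `toRow` helper, the sample property names, and the underscore naming for nested properties are all my own assumptions, not part of any Google API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-in for the per-element conversion step. No Beam or
// Datastore classes are used, so every name here is hypothetical.
class EntityRowSketch {

    // Turns a (possibly nested) property map into a flat column -> value map.
    static Map<String, Object> toRow(Map<String, Object> properties) {
        Map<String, Object> row = new LinkedHashMap<>();
        flatten("", properties, row);
        return row;
    }

    // Nested maps become columns such as "location_lat" (assumed convention).
    private static void flatten(String prefix, Map<String, Object> in,
                                Map<String, Object> out) {
        for (Map.Entry<String, Object> e : in.entrySet()) {
            String column = prefix.isEmpty() ? e.getKey() : prefix + "_" + e.getKey();
            Object value = e.getValue();
            if (value instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = (Map<String, Object>) value;
                flatten(column, nested, out);
            } else {
                out.put(column, value);
            }
        }
    }

    public static void main(String[] args) {
        // Made-up IoT reading, for illustration only.
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("lat", 13.75);
        location.put("lng", 100.5);

        Map<String, Object> reading = new LinkedHashMap<>();
        reading.put("deviceId", "sensor-1");
        reading.put("temperature", 27.4);
        reading.put("location", location);

        System.out.println(toRow(reading));
    }
}
```

In a real pipeline this logic would sit inside a transform applied to the `PCollection<Entity>`, and the output would go to a BigQuery sink such as Beam's `BigQueryIO`. Flattening is only one option: BigQuery also supports nested record columns, so you may prefer to keep the structure instead.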

Dan McGrath