I have a Cloud Dataflow job that writes everything to a single table and a single column family. How can I modify this job to write to multiple tables and column families that may or may not exist? For example, if a table or column family doesn't exist, the job should create it and then write to it.

Misha Brukman
  • Have you checked BigtableIO? How does it behave when it reads from or writes to a table that doesn't exist? – Rui Wang Feb 08 '19 at 19:26
  • I did some quick checks; it seems BigtableIO always assumes the table exists (or validates and fails if the table does not exist). – Rui Wang Feb 08 '19 at 19:34

1 Answer

This example should show you how to check whether a table exists and create it if it does not.

The Admin interface from that code sample can also be used to get the column families of a table (getTableDescriptor) and to add a missing one (addColumn).
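The check-then-create flow described above can be sketched roughly as follows. To keep the sketch runnable without a Bigtable instance, it uses a hypothetical SchemaAdmin interface and an in-memory stand-in; neither is part of any library. In a real job those four methods would presumably delegate to the HBase Admin calls named in the comments, and the table and family names are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class EnsureSchema {

    /** Hypothetical stand-in for the Admin calls mentioned in the answer. */
    interface SchemaAdmin {
        boolean tableExists(String table);              // cf. Admin.tableExists
        void createTable(String table, String family);  // cf. Admin.createTable
        Set<String> getFamilies(String table);          // cf. Admin.getTableDescriptor
        void addFamily(String table, String family);    // cf. Admin.addColumn
    }

    /** Creates the table and/or the column family only if they are missing. */
    static void ensureTableAndFamily(SchemaAdmin admin, String table, String family) {
        if (!admin.tableExists(table)) {
            // A new table is created together with its first column family.
            admin.createTable(table, family);
            return;
        }
        if (!admin.getFamilies(table).contains(family)) {
            admin.addFamily(table, family);
        }
    }

    /** In-memory implementation, used here only so the sketch can run. */
    static class InMemoryAdmin implements SchemaAdmin {
        private final Map<String, Set<String>> tables = new HashMap<>();

        public boolean tableExists(String table) {
            return tables.containsKey(table);
        }

        public void createTable(String table, String family) {
            Set<String> families = new HashSet<>();
            families.add(family);
            tables.put(table, families);
        }

        public Set<String> getFamilies(String table) {
            return tables.get(table);
        }

        public void addFamily(String table, String family) {
            tables.get(table).add(family);
        }
    }

    public static void main(String[] args) {
        SchemaAdmin admin = new InMemoryAdmin();
        ensureTableAndFamily(admin, "events", "cf1"); // creates table and cf1
        ensureTableAndFamily(admin, "events", "cf2"); // adds cf2
        ensureTableAndFamily(admin, "events", "cf1"); // no-op
        System.out.println(admin.getFamilies("events").size()); // prints 2
    }
}
```

Since the comments above suggest BigtableIO expects the table to exist, it is probably simplest to run a check like this once before the pipeline starts writing, rather than per element, so that parallel workers do not race to create the same table.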

Gary Elliott