I have a Cloud Dataflow job that writes everything to a single table and a single column family. How can I modify this job to write to multiple tables and column families that may or may not exist? For example, if a table or column family doesn't exist, the job should create it and then write to it.

Misha Brukman

Sanjay Setia
- Have you checked BigtableIO? What is its behavior when it reads from or writes to something that doesn't exist? – Rui Wang Feb 08 '19 at 19:26
- I did some quick checks. It seems BigtableIO always assumes the table exists (or validates and fails if the table does not exist). – Rui Wang Feb 08 '19 at 19:34
1 Answer
This example should show you how to check to see if a table exists, and create it if not.
The Admin interface from that code sample can also be used to get the column families for a table (getTableDescriptor) and create one if necessary (addColumn).
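The check-then-create logic described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SchemaAdmin`, `InMemorySchemaAdmin`, and `ensureTableAndFamily` are hypothetical names invented here so the sketch runs without a Bigtable instance. In real code, the four operations would roughly correspond to the HBase `Admin` methods `tableExists`, `createTable`, `getTableDescriptor`, and `addColumn`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the admin operations named in the answer.
// It is NOT the HBase Admin interface, only a simplified mirror of it.
interface SchemaAdmin {
    boolean tableExists(String table);
    void createTable(String table, Set<String> families);
    Set<String> getColumnFamilies(String table);
    void addColumnFamily(String table, String family);
}

// In-memory fake (an assumption for this sketch) that tracks tables and families.
class InMemorySchemaAdmin implements SchemaAdmin {
    private final Map<String, Set<String>> tables = new HashMap<>();

    public boolean tableExists(String table) {
        return tables.containsKey(table);
    }

    public void createTable(String table, Set<String> families) {
        if (tables.containsKey(table)) {
            throw new IllegalStateException("table already exists: " + table);
        }
        tables.put(table, new HashSet<>(families));
    }

    public Set<String> getColumnFamilies(String table) {
        Set<String> families = tables.get(table);
        if (families == null) {
            throw new IllegalStateException("no such table: " + table);
        }
        return new HashSet<>(families); // defensive copy
    }

    public void addColumnFamily(String table, String family) {
        Set<String> families = tables.get(table);
        if (families == null) {
            throw new IllegalStateException("no such table: " + table);
        }
        if (!families.add(family)) {
            throw new IllegalStateException("family already exists: " + family);
        }
    }
}

class SchemaSketch {
    // Create the table if it is missing; otherwise add the family if it is missing.
    static void ensureTableAndFamily(SchemaAdmin admin, String table, String family) {
        if (!admin.tableExists(table)) {
            admin.createTable(table, Collections.singleton(family));
        } else if (!admin.getColumnFamilies(table).contains(family)) {
            admin.addColumnFamily(table, family);
        }
    }

    public static void main(String[] args) {
        SchemaAdmin admin = new InMemorySchemaAdmin();
        ensureTableAndFamily(admin, "events", "cf1"); // creates the table with cf1
        ensureTableAndFamily(admin, "events", "cf2"); // adds cf2 to the existing table
        ensureTableAndFamily(admin, "events", "cf1"); // no-op: both already exist
        System.out.println(admin.getColumnFamilies("events"));
    }
}
```

Since the comments suggest BigtableIO expects the table to exist, one option is to run a step like this before the pipeline starts rather than inside it. A check followed by a create is not atomic, so parallel workers doing this at the same time could race.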

Gary Elliott