
I am trying to conditionally append to a column cell value in Bigtable, using the Java client libraries for Dataflow, Apache Beam, and Bigtable, based on two filters (sketched after the list below):

  1. Filtering for a specific row key.
  2. Filtering on a regex match against the value of a specific column in that row.
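
For concreteness, this is roughly how I am building the two filters with the Bigtable client's Filters API; the row key, column family, qualifier, and regex below ("my-row-key", "cf", "status", "ACTIVE.*") are just placeholders:

```java
import static com.google.cloud.bigtable.data.v2.models.Filters.FILTERS;

import com.google.cloud.bigtable.data.v2.models.Filters.Filter;

// Filter 1: match only the specific row key.
Filter rowKeyFilter = FILTERS.key().exactMatch("my-row-key");

// Filter 2: match only when the value of cf:status matches the regex.
Filter columnValueFilter = FILTERS.chain()
    .filter(FILTERS.family().exactMatch("cf"))
    .filter(FILTERS.qualifier().exactMatch("status"))
    .filter(FILTERS.value().regex("ACTIVE.*"));

// Combined predicate to use as the condition for the mutation.
Filter condition = FILTERS.chain()
    .filter(rowKeyFilter)
    .filter(columnValueFilter);
```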

I have found that I can conditionally write a column cell value to a row by passing the filters to a ConditionalRowMutation inside a ParDo step, or append to a column value directly (without the conditional filters) using a ReadModifyWriteRow, but I have not been able to find a way to use the two in conjunction.
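
In isolation, the two operations look roughly like the sketch below (where client is a BigtableDataClient, condition is the combined filter from above, and the table/column names are placeholders): ConditionalRowMutation accepts the condition but can only set a cell, while ReadModifyWriteRow can append but accepts no condition.

```java
import com.google.cloud.bigtable.data.v2.BigtableDataClient;
import com.google.cloud.bigtable.data.v2.models.ConditionalRowMutation;
import com.google.cloud.bigtable.data.v2.models.Mutation;
import com.google.cloud.bigtable.data.v2.models.ReadModifyWriteRow;

// Option A: conditional *write* -- overwrites the cell only when the
// condition filter matches, but cannot append to the existing value.
boolean matched = client.checkAndMutateRow(
    ConditionalRowMutation.create("my-table", "my-row-key")
        .condition(condition)
        .then(Mutation.create().setCell("cf", "log", "new-value")));

// Option B: unconditional *append* -- atomically appends to the cell,
// but offers no way to attach the condition filter.
client.readModifyWriteRow(
    ReadModifyWriteRow.create("my-table", "my-row-key")
        .append("cf", "log", "appended-value"));
```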

I have also noticed that append mutations cannot be issued through CloudBigtableIO, so it looks like I am limited to performing all of the Bigtable writes/appends inside the ParDo function. Since this bypasses the Dataflow Bigtable connector library, I am worried about retries causing duplicate mutations to be executed, as well as race conditions causing inconsistencies in the written data.
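
To make the retry/race concern concrete, the only shape I can see right now is a hand-rolled DoFn roughly like the sketch below (project, instance, table, and column names are placeholders). The filtered read and the append are two separate RPCs, so a bundle retry can re-append and a concurrent writer can change the cell between the check and the append:

```java
import static com.google.cloud.bigtable.data.v2.models.Filters.FILTERS;

import java.io.IOException;

import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

import com.google.cloud.bigtable.data.v2.BigtableDataClient;
import com.google.cloud.bigtable.data.v2.models.ReadModifyWriteRow;
import com.google.cloud.bigtable.data.v2.models.Row;

// Takes (rowKey, valueToAppend) pairs, checks the column regex with a
// filtered read, then appends. The two calls are not atomic.
public class ConditionalAppendFn extends DoFn<KV<String, String>, Void> {

  private transient BigtableDataClient client;

  @Setup
  public void setup() throws IOException {
    // Placeholder project/instance ids.
    client = BigtableDataClient.create("my-project", "my-instance");
  }

  @ProcessElement
  public void processElement(@Element KV<String, String> element) {
    String rowKey = element.getKey();

    // Read the row, keeping only cells in cf:status whose value matches the regex.
    Row row = client.readRow("my-table", rowKey,
        FILTERS.chain()
            .filter(FILTERS.family().exactMatch("cf"))
            .filter(FILTERS.qualifier().exactMatch("status"))
            .filter(FILTERS.value().regex("ACTIVE.*")));

    // Only append when the condition held at read time -- but the cell can
    // change between this check and the append below.
    if (row != null && !row.getCells().isEmpty()) {
      client.readModifyWriteRow(
          ReadModifyWriteRow.create("my-table", rowKey)
              .append("cf", "log", element.getValue()));
    }
  }

  @Teardown
  public void teardown() {
    if (client != null) {
      client.close();
    }
  }
}
```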

Is there a way to safely and conditionally append to a cell in Bigtable from within a Google Cloud Dataflow pipeline step? Any information would be greatly appreciated.
