
I have a table in Google BigQuery that contains some sensitive fields. I have read about inspecting data with the DLP API, but I cannot find a way to use it to redact the data directly in the BigQuery table.

Two questions:

  1. Is it possible to do this using only the DLP API?
  2. If not, what is the best way to redact data in a table that runs into terabytes?
Paulie_D
kawadhiya21

1 Answer


The DLP API does not yet support de-identifying BigQuery tables directly.

You can, however, write a Dataflow pipeline that calls content.deidentify. If you batch your rows into Table objects (https://cloud.google.com/dlp/docs/reference/rest/v2/ContentItem#Table), this can work quite efficiently.
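As a rough illustration of the batching idea, here is a minimal Python sketch, not a complete Dataflow pipeline. The helper names (`rows_to_dlp_table`, `batched`, `deidentify_batch`) are made up for this example, rows are assumed to arrive as dictionaries keyed by column name, and the replace-with-info-type transformation is just one possible choice. It assumes the `google-cloud-dlp` client library is installed and that a batch size is picked to stay within the service's per-request limits.

```python
from typing import Any, Dict, Iterable, Iterator, List, Sequence


def rows_to_dlp_table(columns: Sequence[str], rows: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Convert dict rows into the DLP Table structure (headers + rows of values).

    Every cell is sent as a string_value here for simplicity; NULLs become "".
    """
    return {
        "headers": [{"name": col} for col in columns],
        "rows": [
            {
                "values": [
                    {"string_value": "" if row.get(col) is None else str(row[col])}
                    for col in columns
                ]
            }
            for row in rows
        ],
    }


def batched(rows: Iterable[Dict[str, Any]], batch_size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of at most batch_size rows (hypothetical helper)."""
    batch: List[Dict[str, Any]] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def deidentify_batch(project_id: str, columns: Sequence[str],
                     rows: Iterable[Dict[str, Any]], info_types: Sequence[str]):
    """Send one batch of rows to content.deidentify and return the redacted table.

    Assumption: findings are replaced by their info type name, e.g. [EMAIL_ADDRESS].
    """
    # Imported lazily so the pure-Python helpers above work without the library.
    from google.cloud import dlp_v2

    client = dlp_v2.DlpServiceClient()
    response = client.deidentify_content(
        request={
            "parent": f"projects/{project_id}",
            "inspect_config": {"info_types": [{"name": name} for name in info_types]},
            "deidentify_config": {
                "info_type_transformations": {
                    "transformations": [
                        {"primitive_transformation": {"replace_with_info_type_config": {}}}
                    ]
                }
            },
            "item": {"table": rows_to_dlp_table(columns, rows)},
        }
    )
    return response.item.table
```

In a Dataflow (Apache Beam) pipeline, the same per-batch call would typically sit inside a transform between the BigQuery read and the BigQuery write, so that each worker sends one Table per request instead of one request per row.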

Jordanna Chord