
I am streaming data into BigQuery. Every day, a scheduled Dataprep job takes the last 24 hours of data, applies some transformations, and writes the result to a new table in the BQ dataset.

The original table, though, stays unmodified and keeps gathering streamed data.

What I would like to do is delete all rows in the original table after Dataprep makes its copy, so that a fresh 24 hours of streaming data can be gathered.

How can I automate this? I can't seem to find anything in Dataprep that drops the original table and creates a new one.

  • There are 2 options on this thread https://stackoverflow.com/questions/50621604/bigquery-delete-on-streaming-table – Bobbylank Nov 28 '18 at 12:29

2 Answers


You can do this by setting up your table as a partitioned table, since you are ingesting data constantly.

One option is to delete a partition manually:

bq rm '[YOUR_DATASET].[YOUR_TABLE]$xxxxxxx'
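
Here the suffix after the `$` is the partition decorator, i.e. the partition's date in YYYYMMDD form. A minimal sketch, assuming a hypothetical day-partitioned table mydataset.mytable and deleting the partition for 27 November 2018:

# mydataset.mytable is a hypothetical day-partitioned table.
# This deletes only the 2018-11-27 partition; all other partitions stay intact.
bq rm 'mydataset.mytable$20181127'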

Alternatively, you can set a partition expiration time, so that each partition's data is deleted automatically once it reaches that age:

bq update --time_partitioning_expiration [INTEGER] [YOUR_PROJECT_ID]:[YOUR_DATASET].[YOUR_TABLE]
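
The expiration is given in seconds. A minimal sketch, assuming the same hypothetical table and a 24-hour retention to match the daily Dataprep run:

# Hypothetical project/dataset/table names; 86400 seconds = 24 hours.
# Each day's partition is dropped automatically once it is a day old.
bq update --time_partitioning_expiration 86400 myproject:mydataset.mytable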
Alex Riquelme

You could use a Scheduled Query to clear out the table:

https://cloud.google.com/bigquery/docs/scheduling-queries

Scheduled queries support DDL and DML, so you could schedule a query that deletes all the rows from this table (a DML DELETE) or drops the table entirely (DDL), running daily at a specific time.
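
As a minimal sketch of the statement you might schedule, assuming the same hypothetical table mydataset.mytable as above (shown here as a one-off bq CLI invocation; in a scheduled query you would paste just the SQL):

# DML that removes every row but keeps the table and its schema.
# As the comment below notes, this fails while rows are still in the streaming buffer.
bq query --use_legacy_sql=false 'DELETE FROM mydataset.mytable WHERE TRUE'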

Ben P
  • I have tried that but because the streaming buffer is still active I get this error `Error: UPDATE or DELETE DML statements are not supported over table with streaming buffer` – user3895426 Nov 28 '18 at 10:35