6

I have a few date-sharded tables I want to delete but they already have more than 100 shards for each table and can't drop them manually.

I've tried using wildcards

DROP TABLE my_dataset.my_table_*;

but it does not seem to work.

I finally used the python API:

for table_id in tables:
    table_ref = client.dataset(dataset_id).table(table_id)
    client.delete_table(table_ref)

And it works, but I needed to create the tables array with the names of the tables I wanted to drop.

Is there a way to DROP all date-shards of a date-sharded table in BigQuery form the UI?

Or using an SQL command in the UI?

Or using the command line with a wildcard?

Thanks

Sander van den Oord
  • 10,986
  • 5
  • 51
  • 96
Ary Jazz
  • 1,576
  • 1
  • 16
  • 25

3 Answers3

4

And what about instead of creating the tables array (with the names of the tables) you use...

from google.cloud import bigquery
client = bigquery.Client()
dataset_ref = client.dataset('my_dataset')

tables = list(client.list_tables(dataset_ref))  # API request(s), now you have the list of tables in this dataset
queried_tables=[]
for table in tables:
    print(table.table_id)
    if table.table_id.startswith("your_favourite_prefix"): #will perform the action only if the table has the desired prefix
        queried_tables.append(table.table_id)

print(queried_tables) #the list of the desired tables names, now you can use your script to delete them all
Temu
  • 859
  • 4
  • 11
2

If you would like to use SQL and the BigQuery console you can do as follows:

1) Query the INFORMATION_SCHEMA to find all tables that you would like to delete and create a DROP TABLE statement around that, like this:

SELECT 'DROP TABLE ' || 'your_project_id.' || table_schema || '.' || table_name || ';'
FROM `your_project_id.your_dataset.INFORMATION_SCHEMA.TABLES`
WHERE table_name LIKE 'your_table_name_202202%'
ORDER BY table_name ASC;

2) The result of this query will look like this:

DROP TABLE your_project_id.your_dataset.sales.__am-123;
DROP TABLE your_project_id.your_dataset.sales.__am-134;
etc etc

3) Now copy the result of this query to the clipboard.

save results of query to clipboard

4) Then paste that to a new query and run that to drop all the tables.


See also here: Delete BigQuery tables with wildcard

Sander van den Oord
  • 10,986
  • 5
  • 51
  • 96
1

There is no built in way to drop all tables sharing a common prefix. Your approach of using the Python library to delete them all is a reasonable option, or you could do the same thing from the command-line with a loop that calls bq rm dataset.table_name.

Elliott Brossard
  • 32,095
  • 2
  • 67
  • 99