My Python application needs to export BigQuery tables to CSV files in GCS, ideally each smaller than 1GB.
Following the documentation, I wrote the following code:
from google.cloud import bigquery

extract_job = bigquery.Client().extract_table(
    'my_project.my_dataset.my_5GB_table',
    'gs://my-bucket/*.csv')  # wildcard URI so the export can be sharded
extract_job.result()  # wait for the extract job to finish
The table my_5GB_table is approximately 5GB, but the export produced a single 10GB CSV file in GCS.
I tried other tables of various sizes: some were split into files of about 200MB each, while others again ended up as a single huge file.
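For reference, this is roughly how I checked the resulting file sizes (a minimal sketch using the google-cloud-storage client; the bucket name is a placeholder):

from google.cloud import storage

# List the exported objects and print their sizes in GB
for blob in storage.Client().list_blobs('my-bucket'):
    print(blob.name, blob.size / 1024 ** 3, 'GB')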
The documentation reads as if tables were always split into files of about 1GB each, but I can't figure out the rules that determine how the output is divided.
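In case it matters, I also experimented with passing an explicit job config, but I couldn't find any size-related option there (a sketch of what I tried; destination_format and compression were the only output settings I saw):

from google.cloud import bigquery

# Same export with an explicit ExtractJobConfig; no file-size option found
job_config = bigquery.ExtractJobConfig(destination_format='CSV')
bigquery.Client().extract_table(
    'my_project.my_dataset.my_5GB_table',
    'gs://my-bucket/part-*.csv',
    job_config=job_config).result()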
Q1. How can I make sure that tables are always exported as files smaller than 1GB?
Q2. Is there a way to specify the size of the files into which a table is split?