I am trying to automate table creation in BigQuery by reading raw files from a bucket (the table name should be derived from the bucket file name), using a YAML file as the configuration. Can anyone provide a lead on how to write this, with some code sample?
I am doing something similar. It also depends on how you want to read the bucket's raw files; in my case it is a GCS notification + Pub/Sub.
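For context, when a GCS notification is delivered through Pub/Sub, the bucket name arrives in the event's attributes and the object details arrive as base64-encoded JSON in the event's data. A minimal sketch of unpacking that (the function name and the simulated event below are illustrative, not part of my actual pipeline):

```python
import base64
import json

def extract_file_info(event):
    # GCS notifications put the bucket name in the Pub/Sub attributes
    # and the object metadata (including "name") in the JSON payload.
    attrs = event["attributes"]
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    return attrs["bucketId"], payload["name"]

# Simulated GCS notification event, shaped like what a Pub/Sub-triggered
# Cloud Function receives:
event = {
    "attributes": {"bucketId": "my-raw-files", "eventType": "OBJECT_FINALIZE"},
    "data": base64.b64encode(
        json.dumps({"name": "sales_2021.csv"}).encode()
    ).decode(),
}
bucket, filename = extract_file_info(event)
# bucket == "my-raw-files", filename == "sales_2021.csv"
```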
Very simple example:
import os
import re
from google.cloud import bigquery

# `event` and `filename` come from the Pub/Sub notification;
# `project_id` and `dataset_id` are assumed to be defined elsewhere.
uri = "gs://" + event['attributes']['bucketId'] + "/" + filename
table = os.path.splitext(filename)[0]
# Keep only letters and digits in the table name (no special characters)
table = re.sub('[^A-Za-z0-9]+', '', table)
# Construct a BigQuery client object.
client = bigquery.Client()
# Name of the table, which will be created automatically by BigQuery
table_id = project_id + "." + dataset_id + "." + table
job_config = bigquery.LoadJobConfig(
    autodetect=True,
    source_format=bigquery.SourceFormat.CSV,
)
load_job = client.load_table_from_uri(
    uri, table_id, job_config=job_config
)
load_job.result()  # wait for the load job to complete
The BigQuery load job will create the table automatically if it does not exist.
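If you want the project, dataset, and bucket to come from a YAML config rather than being hard-coded, you could load them before building the table id. A minimal sketch, assuming a flat config file (the file name `config.yml` and its keys are hypothetical; a real project would parse it with PyYAML's `yaml.safe_load` instead of the hand-rolled reader below):

```python
def load_config(path):
    # Parse flat "key: value" pairs, skipping blanks and comments.
    # This stands in for yaml.safe_load so the sketch needs no extra deps.
    config = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition(":")
            config[key.strip()] = value.strip()
    return config

# Write a small example config so the sketch is self-contained.
with open("config.yml", "w") as f:
    f.write("project_id: my-project\n"
            "dataset_id: my_dataset\n"
            "bucket: my-raw-files\n")

config = load_config("config.yml")
# Build the fully qualified table id from the config values
# ("mytable" stands in for the name derived from the file).
table_id = config["project_id"] + "." + config["dataset_id"] + "." + "mytable"
# table_id == "my-project.my_dataset.mytable"
```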

CaioT
Thanks for your input. I wanted to understand how you are reading the tables and the other details like dataset, bucket, etc. Are they from the config file? If so, can you please elaborate more on this? – SKP Aug 20 '21 at 19:12