I'm trying to push data from GCS to a BigQuery table using the Airflow operator GCSToBigQueryOperator. Below is what I have:
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

parquet_to_bq = GCSToBigQueryOperator(
    task_id="gcs_to_bq_task",
    bigquery_conn_id="dev",
    bucket="bucket_id",
    source_format="PARQUET",
    source_objects=["test/*"],
    destination_project_dataset_table="table_name",
    write_disposition="WRITE_TRUNCATE",
    impersonation_chain=IMPERSONATE_SERVICE_ACCOUNT,
)
The GCS object paths follow the format below:
bucket_id/test/op_dt=2021-01-01/1.parquet
bucket_id/test/op_dt=2021-01-02/2.parquet
The table in BigQuery is partitioned on op_dt. As far as I can tell, op_dt exists only in the object paths (hive-style layout) and not as a column inside the parquet files themselves. When I execute the DAG, I get the following error:
google.api_core.exceptions.BadRequest: 400 The field specified for partitioning cannot be found in the schema
I want to load all of the partitions from GCS into the BigQuery table. What modifications do I need to make for this operator to work?
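For context, here is a minimal sketch of the load I am trying to express, written against the google-cloud-bigquery client directly rather than through the operator. The hive partitioning options tell BigQuery to derive op_dt from the op_dt=... path segments; the project and dataset names below are placeholders:

from google.cloud import bigquery

client = bigquery.Client()

# Hive partitioning: derive op_dt from the op_dt=YYYY-MM-DD path segments.
hive_opts = bigquery.external_config.HivePartitioningOptions()
hive_opts.mode = "AUTO"  # infer partition key names and types from the paths
hive_opts.source_uri_prefix = "gs://bucket_id/test/"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.PARQUET,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    hive_partitioning=hive_opts,
)

# "my-project.my_dataset.table_name" is a placeholder destination.
load_job = client.load_table_from_uri(
    "gs://bucket_id/test/*",
    "my-project.my_dataset.table_name",
    job_config=job_config,
)
load_job.result()  # block until the load job completes

If the operator cannot express these hive partitioning options in my provider version, would wrapping a load like this in a PythonOperator be a reasonable fallback?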