I am trying to create an external table in BigQuery for a Parquet file stored in a GCS bucket. When I run the code below in Airflow, I get the following error:
ERROR:
[2023-07-04, 10:03:44 UTC] {taskinstance.py:1770} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/operators/bigquery.py", line 1712, in execute
    table = bq_hook.create_empty_table(
  File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/common/hooks/base_google.py", line 468, in inner_wrapper
    return func(self, *args, **kwargs)
  File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 413, in create_empty_table
    return self.get_client(project_id=project_id, location=location).create_table(
  File "/opt/python3.8/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 779, in create_table
    api_response = self._call_api(
  File "/opt/python3.8/lib/python3.8/site-packages/google/cloud/bigquery/client.py", line 813, in _call_api
    return call()
  File "/opt/python3.8/lib/python3.8/site-packages/google/api_core/retry.py", line 349, in retry_wrapped_func
    return retry_target(
  File "/opt/python3.8/lib/python3.8/site-packages/google/api_core/retry.py", line 191, in retry_target
    return target()
  File "/opt/python3.8/lib/python3.8/site-packages/google/cloud/_http/__init__.py", line 494, in api_request
    raise exceptions.from_http_response(response)
google.api_core.exceptions.BadRequest: 400 POST https://bigquery.googleapis.com/bigquery/v2/projects/idmp-mii-dev-ddb5/datasets/tivo_site_activity_0/tables?prettyPrint=false: CsvOptions can only be specified if storage format is CSV.
DAG CODE:
from airflow.providers.google.cloud.operators.bigquery import BigQueryCreateExternalTableOperator

create_imp_external_table = BigQueryCreateExternalTableOperator(
    task_id="create_imp_external_table",
    bucket='my-bucket',
    source_objects=["/data/userdata1.parquet"],  # passed as a list
    destination_project_dataset_table="my-project.my_dataset.parquet_table",
    source_format='PARQUET',  # using source_format instead of file_format
)
Composer version: 2.3.2
Airflow version: 2.5.1
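The error message suggests that the request sent to BigQuery carried CSV options alongside a Parquet source format. One thing that may be worth trying is to describe the table explicitly through the operator's `table_resource` argument, so that no CSV options end up in the request. The sketch below only builds that dictionary. The helper name `build_external_table_resource` is hypothetical, the field names follow the BigQuery REST `Table` resource, and the object path is assumed to have no leading slash. I have not verified this against Composer 2.3.2.

```python
from typing import Any, Dict, List


def build_external_table_resource(
    project_id: str,
    dataset_id: str,
    table_id: str,
    source_uris: List[str],
    source_format: str = "PARQUET",
) -> Dict[str, Any]:
    """Hypothetical helper: build a BigQuery Table resource for an external table."""
    return {
        "tableReference": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
        "externalDataConfiguration": {
            "sourceFormat": source_format,
            "sourceUris": source_uris,
            # Deliberately no "csvOptions" key, since the source is not CSV.
        },
    }


# Assumed URI: bucket and object path from the DAG, without the leading slash.
table_resource = build_external_table_resource(
    project_id="my-project",
    dataset_id="my_dataset",
    table_id="parquet_table",
    source_uris=["gs://my-bucket/data/userdata1.parquet"],
)
```

If this works as expected, the dictionary would be passed as `table_resource=table_resource` to `BigQueryCreateExternalTableOperator` in place of `bucket`, `source_objects`, `destination_project_dataset_table` and `source_format`.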