
Using GCSToBigQueryOperator, I get this error:

    Broken DAG: [/opt/airflow/dags/injest_data.py] Traceback (most recent call last):
      File "/opt/airflow/dags/injest_data.py", line 79, in <module>
        "sourceUris": [f"gs://{BUCKET_NAME}/*"],
      File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/baseoperator.py", line 397, in apply_defaults
        raise AirflowException(f"missing keyword arguments {display}")
    airflow.exceptions.AirflowException: missing keyword arguments 'bucket', 'destination_project_dataset_table', 'source_objects'

And when I tried to change to BigQueryCreateExternalTableOperator, this other error occurs:

    Broken DAG: [/opt/airflow/dags/injest_data.py] Traceback (most recent call last):
      File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/baseoperator.py", line 411, in apply_defaults
        result = func(self, **kwargs, default_args=default_args)
      File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/baseoperator.py", line 760, in __init__
        f"Invalid arguments were passed to {self.__class__.__name__} (task_id: {task_id}). "
    airflow.exceptions.AirflowException: Invalid arguments were passed to BigQueryCreateExternalTableOperator (task_id: bq_external_table_task). Invalid arguments were:
    **kwargs: {'tables_resouces': {'tableReferences': {'projectId': 'de-projects-373304', 'datasetId': 'stockmarket_dataset', 'tableId': 'stockmarket_ex

Thanks in advance for your help...

I have tried changing the Google query operators and even tried a different method to upload the data to BigQuery, but it says the schema doesn't exist. Please help me understand what I am doing wrong. Below is the code causing the error:

    bq_external_table_task = BigQueryCreateExternalTableOperator(
        task_id="bq_external_table_task",
        tables_resouces={
            "tableReferences": {
                "projectId": PROJECT_ID,
                "datasetId": BIGQUERY_DATASET,
                "tableId": f"{DATASET}_external_table",
            },
            "externalDataConfiguration": {
                "autodetect": True,
                "sourceFormat": f"{INPUT_FILETYPE.upper()}",
                "sourceUris": [f"gs://{BUCKET_NAME}/*"],
            },
        },
    )

  • @Hussein I need your help: Airflow is failing to connect to GCP. I followed what you told someone else to do and it helped them, but it isn't working for me. I used SFTP to send the credentials file because I'm working over SSH, connected it to Airflow, exported the airflow_con_default, and also tried using the credentials file path to set up the GCP connection, but none of it worked. This is my error: """ FileNotFoundError: [Errno 2] No such file or directory: '/.google/credentials/google_credentials.json' – Efe Feb 05 '23 at 07:33
  • Since this is not related to the original issue of this question, I'd prefer you create a new question providing all the context, and I'll try to help you solve it. – Hussein Awala Feb 10 '23 at 00:47
  • Thank you Hussein, I have created a new question for this here: https://stackoverflow.com/questions/75350905/airflow-cant-connect-to-google-credentials – Efe Feb 10 '23 at 03:59

1 Answer


There is no sourceUris parameter in GCSToBigQueryOperator; it takes source_objects instead (along with bucket and destination_project_dataset_table, which the error message reports as missing). Kindly check the operator's parameters in the official documentation: GCSToBigQueryOperator.
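
A minimal sketch of how the load task could look with the documented parameter names, reusing the variables from your DAG; the task_id, target table name, and write_disposition here are only illustrative:

    from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

    # bucket takes the bucket name only (no gs:// prefix); source_objects is a
    # list of object paths or wildcards inside that bucket.
    load_to_bq_task = GCSToBigQueryOperator(
        task_id="gcs_to_bq_task",  # illustrative task id
        bucket=BUCKET_NAME,
        source_objects=["*"],
        destination_project_dataset_table=f"{PROJECT_ID}.{BIGQUERY_DATASET}.{DATASET}_table",
        source_format=INPUT_FILETYPE.upper(),
        autodetect=True,
        write_disposition="WRITE_TRUNCATE",  # illustrative; adjust to your needs
    )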

Your BigQueryCreateExternalTableOperator also has a wrong parameter name: tables_resouces should be table_resource. You can check this operator's parameters in the official documentation as well: BigQueryCreateExternalTableOperator. A corrected sketch follows.
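
Something along these lines should get past the error, assuming the same variables from your DAG; note that the BigQuery table resource uses the singular key "tableReference":

    from airflow.providers.google.cloud.operators.bigquery import BigQueryCreateExternalTableOperator

    # table_resource (not tables_resouces) is the documented parameter; the
    # table resource itself uses "tableReference", not "tableReferences".
    bq_external_table_task = BigQueryCreateExternalTableOperator(
        task_id="bq_external_table_task",
        table_resource={
            "tableReference": {
                "projectId": PROJECT_ID,
                "datasetId": BIGQUERY_DATASET,
                "tableId": f"{DATASET}_external_table",
            },
            "externalDataConfiguration": {
                "autodetect": True,
                "sourceFormat": INPUT_FILETYPE.upper(),
                "sourceUris": [f"gs://{BUCKET_NAME}/*"],
            },
        },
    )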

mhmtersy