
I'm migrating data from Amazon S3 to BigQuery using the BigQuery Data Transfer Service. The source files are CSVs generated into a specific bucket. Normally this bucket (s3://landing_data/to_load/) contains two types of files:

33dd21234-0fab-4118-93de-cdf44321.csv
33dd21234-0fab-4118-93de-cdf44321.csv.metadata

My goal is to load the CSV file into a BigQuery table, and for that I have this DTS configuration:

{
    "destination_dataset_id": "landing_data",
    "display_name": "data_load_from_s3",
    "data_source_id": "amazon_s3",  # connection type
    "schedule_options": {"disable_auto_scheduling": True},
    "params": {
        "destination_table_name_template": "bigquery_table_id",
        "file_format": "CSV",
        "data_path": "s3://landing_data/to_load/*.csv",
        "access_key_id": aws_access_conn_id,
        "secret_access_key": aws_secret_conn_id,
        "field_delimiter": ",",
        "max_bad_records": "0",
        "skip_leading_rows": "1",
        "ignore_unknown_values": True,
        "allow_jagged_rows": True,
    },
}
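
For reference, here is a minimal sketch of how this configuration could be created with the google-cloud-bigquery-datatransfer Python client (the project ID "my-gcp-project" is a placeholder, and aws_access_conn_id / aws_secret_conn_id are the credential variables from above):

    # Sketch only: creates the same transfer config programmatically.
    from google.cloud import bigquery_datatransfer

    client = bigquery_datatransfer.DataTransferServiceClient()

    transfer_config = bigquery_datatransfer.TransferConfig(
        destination_dataset_id="landing_data",
        display_name="data_load_from_s3",
        data_source_id="amazon_s3",
        schedule_options=bigquery_datatransfer.ScheduleOptions(
            disable_auto_scheduling=True
        ),
        params={
            "destination_table_name_template": "bigquery_table_id",
            "file_format": "CSV",
            "data_path": "s3://landing_data/to_load/*.csv",
            "access_key_id": aws_access_conn_id,      # placeholder credential
            "secret_access_key": aws_secret_conn_id,  # placeholder credential
            "field_delimiter": ",",
            "max_bad_records": "0",
            "skip_leading_rows": "1",
            "ignore_unknown_values": True,
            "allow_jagged_rows": True,
        },
    )

    # "my-gcp-project" is a placeholder project ID.
    transfer_config = client.create_transfer_config(
        parent=client.common_project_path("my-gcp-project"),
        transfer_config=transfer_config,
    )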

The DTS job runs, but most of the time it gives me the error:

"No files found matching: "s3://landing_data/to_load/*.csv""

This is strange: most of the time I get this error, but if I run the transfer continuously it eventually loads the data correctly, especially when the data volume increases. Sometimes the volume is very small (I don't know if that is the problem), but I can't tell whether I'm doing something wrong to get this message.
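
By "run continuously" I mean re-triggering the transfer until it picks up the files, roughly like this (a sketch with the Python client; transfer_config_name is a placeholder for the config's resource name):

    from google.cloud import bigquery_datatransfer
    from google.protobuf import timestamp_pb2

    client = bigquery_datatransfer.DataTransferServiceClient()

    # Placeholder: resource name of the transfer config created above.
    transfer_config_name = "projects/PROJECT/locations/us/transferConfigs/CONFIG_ID"

    # Ask DTS to run the transfer right now.
    run_time = timestamp_pb2.Timestamp()
    run_time.GetCurrentTime()

    response = client.start_manual_transfer_runs(
        request=bigquery_datatransfer.StartManualTransferRunsRequest(
            parent=transfer_config_name,
            requested_run_time=run_time,
        )
    )

    for run in response.runs:
        print(run.name, run.state)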

Does anyone have an idea?
