I'm trying to move a 74 GB CSV file from AWS S3 to BigQuery using the BigQuery Data Transfer Service (DTS). It has been running for 9 hours and still hasn't finished. The logs don't show any errors, but one message keeps appearing:
Transfer from Amazon S3 to Google Cloud in progress. Currently discovered 1 object(s) and transferred 0 object(s); discovery and transfer still in progress...
Has anyone else had long waits with DTS? I'm wondering:
- Is it normal for it to take this long?
- Are there any limits in DTS or BigQuery that might slow things down?
- Could AWS or GCP be slowing down the transfer?
- Are there any tools I can use to see how much data has been moved so far?
- Are there faster ways to move data from AWS to GCP?
I checked the logs in Logs Explorer, but they don't contain any errors. 97% of the log entries are identical, so I am sharing a representative entry here, with some information that is not relevant to the issue redacted.
{
  "insertId": "...",
  "jsonPayload": {
    "message": "Transfer from Amazon S3 to Google Cloud in progress. Currently discovered 1 object(s) and transferred 0 object(s); discovery and transfer still in progress..."
  },
  "resource": {
    "type": "bigquery_dts_config",
    "labels": {
      "project_id": "...",
      "config_id": "...",
      "location": "asia-northeast3"
    }
  },
  "timestamp": "2023-08-18T11:29:29.219065235Z",
  "severity": "INFO",
  "labels": {
    "run_id": "..."
  },
  "logName": "...",
  "receiveTimestamp": "2023-08-18T11:29:29.965047596Z"
}
This is also the most recent log entry. It makes me think the transfer has probably not even started yet, but I am not sure. I don't understand what an 'object' represents here. If it is the whole table, then presumably it won't be marked as transferred until all of its data has been transferred.
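In case it is useful for tracking progress, here is a minimal sketch of how the discovered/transferred counters could be pulled out of log entries like the one above. It assumes the entries have been exported as JSON with the same shape as my sample; `parse_progress` and `PROGRESS_RE` are hypothetical names for illustration, not part of any Google library, and the regex is based only on the message text I am seeing.

```python
import json
import re

# Hypothetical pattern, written against the exact message wording in my logs.
PROGRESS_RE = re.compile(
    r"discovered (\d+) object\(s\) and transferred (\d+) object\(s\)"
)


def parse_progress(entry_json):
    """Return (discovered, transferred) for one log entry, or None if absent."""
    entry = json.loads(entry_json)
    message = entry.get("jsonPayload", {}).get("message", "")
    match = PROGRESS_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Sample entry reduced to the fields the parser actually reads.
sample = json.dumps({
    "jsonPayload": {
        "message": (
            "Transfer from Amazon S3 to Google Cloud in progress. "
            "Currently discovered 1 object(s) and transferred 0 object(s); "
            "discovery and transfer still in progress..."
        )
    },
    "severity": "INFO",
})

print(parse_progress(sample))  # (1, 0)
```

Running this over every entry for a run would at least show whether the transferred count ever moves off 0, although it says nothing about how many bytes of the single object have been copied.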
Furthermore, there is no data in the BigQuery table so far: the table's storage information in BigQuery shows a total of 0 bytes.
Thanks for any help or tips you can share!