I configured a DataStream in GCP to replicate data from MySQL to BigQuery. It was working successfully for some time and initial data sump is successful to BQ, but after that, I keep on getting the error,
{
"insertId": "c8c68035-bd9b-4e5d-a350-5abf2bc33163",
"jsonPayload": {
"event_code": "MYSQL_CDC_FETCH_FAILED",
"message": "CDC fetch failed: We couldn't connect to the MySQL server 35.196.212.155 on port 3306 as our request timed out. Make sure that the connection info is correct, the host is up and reachable on that port, and Datastream IPs were allowlisted..",
"original_message": "(2003, \"Can't connect to MySQL server on 'xxx.xxx.xxx.xxx' (timed out)\")",
"read_method": "mysql-cdc-binlog",
"context": "CDC"
},
"resource": {
"type": "datastream.googleapis.com/Stream",
"labels": {
"location": "us-east1",
"resource_container": "projects/444429740053",
"stream_id": "replication-stream-json"
}
},
"timestamp": "2022-09-19T06:21:18.932065Z",
"severity": "ERROR",
"logName": "projects/*****/logs/datastream.googleapis.com%2Fstream_activity",
"receiveTimestamp": "2022-09-19T06:21:19.879866597Z"
}
I use IP whitelisting in MySQL to read from DataStream. I see the pipeline in the failed state now and this is what it looks like,
But if I choose a particular table from the object page and choose "Initiate Backfill", it still works.
I don't have any clue about this behavior and would appreciate any help/directions on this.
Edit: Adding further observations, I'm having 2 SQL MySQL instances in ClouSQL. One is very active (lots of updates/inserts/deletes) and another one is inactive. And the problem of incremental changes replication happens only on the active instance, if I make some changes manually (to see if replication happens) in the inactive instance it's getting replicated to BQ in around 15 mins.