Background
We have a worker application server that runs long-running report export jobs. Since they are export jobs, we connected it to a (managed, not serverless) Aurora database cluster with one writer instance and read replicas that auto-scale with the following scaling policy:
The worker uses the reader endpoint that comes out of the box with the DB cluster, which should distribute connections evenly across the existing readers.
Problem
We noticed that export jobs attempting to connect through the cluster's reader endpoint fail with this error:
SQLSTATE[HY000]: General error: 7 SSL SYSCALL error: EOF detected
We were able to verify that this error happens exactly when autoscaling happens, because we cross-referenced the timing of the errors:
To make the problem worse, all subsequent export attempts fail with the same error, even after the auto-scaling is over. The only way we were able to fix this was by restarting the worker application servers.
Question
What can we do to let our database cluster gracefully handle incoming db connections to the read replicas while it's scaling? Or how do we force the worker to re-establish a new connection if it finds that the current one is terminated?
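To make the second option concrete, here is a rough sketch of the behavior we are after. It is written in Python purely for illustration and is not our actual worker code. `ReconnectingClient`, `ConnectionLost`, and the `connect` callable are hypothetical names, not part of any real driver; in practice the exception would be whatever the database driver raises for a dead connection (such as the SSL SYSCALL error above).

```python
import time


class ConnectionLost(Exception):
    """Hypothetical stand-in for the driver's 'connection is dead' error."""


class ReconnectingClient:
    """Holds one connection and replaces it when it turns out to be dead."""

    def __init__(self, connect, max_attempts=3, backoff_seconds=0.5):
        if max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        self._connect = connect            # callable that opens a fresh connection
        self._max_attempts = max_attempts
        self._backoff_seconds = backoff_seconds
        self._conn = None                  # opened lazily on first use

    def run(self, operation):
        """Call operation(conn); on a dead connection, reconnect and retry."""
        last_error = None
        for attempt in range(1, self._max_attempts + 1):
            try:
                if self._conn is None:
                    self._conn = self._connect()
                return operation(self._conn)
            except ConnectionLost as exc:
                last_error = exc
                # Drop the dead handle so the next attempt opens a new
                # connection instead of reusing the broken one.
                self._conn = None
                if attempt < self._max_attempts:
                    time.sleep(self._backoff_seconds * attempt)
        raise last_error
```

The important part is that the dead handle is discarded rather than reused, which is what currently seems to require a full restart of the worker. Note that a retry like this re-runs the whole operation, so a long export would start over rather than resume.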