I have a Node.js app (Express) running in Docker, deployed on AWS Lightsail Container Service. It uses a Postgres database on AWS RDS (via knex).
This might be several related problems.
The connection usually works fine for about 2 days. Then, sometimes after running migrations, the app on Lightsail is no longer able to connect to RDS.
For some time I can still connect to the RDS DB from DBeaver, from my locally running non-dockerized app, and from my locally running dockerized version of the app, but after a while these connections also stop working.
At that point there is no way to access the RDS DB anymore.
I then start doing things like deleting and recreating the VPC peering (Lightsail VPC to default VPC), deleting and re-adding the inbound rules of the security group, and creating another database. Basically everything I can think of. At some point the connection starts working again. Unfortunately, this time it only works from my dev machine and still not from the app on Lightsail.
Things I triple-checked:
- the DATABASE_URL: for debugging I made a route that prints the env, so I can make sure the correct DATABASE_URL connection string is set. The same string works fine for connections from my dev machine.
- the AWS RDS DB is publicly accessible
- the security group for the AWS RDS DB has inbound rules for the Lightsail VPC CIDR; at the moment it actually has a completely open rule for IPv4 and IPv6
- I rebooted the RDS database
- I added another RDS DB, to which I also cannot connect from Lightsail, although I can from my dev machine
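To make the DATABASE_URL check above concrete, here is a minimal sketch of what such a debug route could return. It is not my exact code: `describeDatabaseUrl` and the `/debug/db` path are hypothetical names, and it assumes the connection string is a standard `postgres://user:password@host:port/db` URL. It reports the host and port the running container actually sees, without printing the password, and flags the case where the value is missing or points at localhost.

```javascript
// Hypothetical helper: summarise a Postgres connection string
// without leaking the password into logs or HTTP responses.
function describeDatabaseUrl(raw) {
  if (!raw) {
    // The variable is not set (or empty) inside this process.
    return { defined: false };
  }
  const url = new URL(raw);
  const host = url.hostname;
  return {
    defined: true,
    host,
    // Assumption: fall back to the usual Postgres port when none is given.
    port: url.port || '5432',
    database: url.pathname.replace(/^\//, ''),
    user: decodeURIComponent(url.username),
    hasPassword: url.password !== '',
    sslmode: url.searchParams.get('sslmode'),
    // True when the string itself points at the container, not at RDS.
    pointsAtLocalhost: ['localhost', '127.0.0.1', '[::1]'].includes(host),
  };
}

// Possible usage in an Express app (route path is an assumption):
// app.get('/debug/db', (req, res) =>
//   res.json(describeDatabaseUrl(process.env.DATABASE_URL)));
```

If this returned `defined: false` or `pointsAtLocalhost: true` on Lightsail, the problem would be in how the environment reaches the container rather than in the network.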
The error that shows in the Lightsail Docker Logs is
[23/Oct/2022:16:28:42] Error: connect ECONNREFUSED 127.0.0.1:5432
[23/Oct/2022:16:28:42] at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
- Is it odd that it says `127.0.0.1`?
- I use `ssl: false`
- Because the issue usually happens after rolling back and re-running 29 migrations and seeding the tables, I thought it might be a max-connection issue. That might explain the initial problem, but not why I can now connect from my machine but not from Lightsail.
- Do I have to configure docker to allow outbound traffic on port 5432?
- It's also strange that I'm usually not able to access any RDS database when I have the problem.
- Rolling back the Lightsail app to a previously working version works, but I don't know how to download the image for further inspection.
- regarding the maintenance window, I'm not sure if it's related. My main problem is that I still can't connect from Lightsail even after more than 24 hours. I suspect it's something in the AWS network layer or in the docker networking config.
- (I don't know how to use the AWS Reachability Analyzer because I don't know what to use for the Lightsail container app)
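Since I can't use the Reachability Analyzer, one way I could test outbound traffic on port 5432 from inside the container is a plain TCP probe. The following is only a sketch using Node's built-in `net` module; `probeTcp` is a hypothetical helper, and the RDS hostname in the usage comment is a placeholder. It only checks whether a TCP connection can be opened, not whether Postgres authentication would succeed.

```javascript
const net = require('net');

// Hypothetical helper: try to open a raw TCP connection and report
// whether it succeeded, timed out, or failed with an error code.
function probeTcp(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (result) => {
      socket.destroy();
      // Calling resolve more than once is harmless; only the first call counts.
      resolve({ host, port, ...result });
    };
    socket.setTimeout(timeoutMs);
    socket.on('connect', () => finish({ reachable: true }));
    socket.on('timeout', () => finish({ reachable: false, reason: 'timeout' }));
    socket.on('error', (err) => finish({ reachable: false, reason: err.code }));
  });
}

// Possible usage from a debug route or a one-off script
// (the hostname below is a placeholder, not my real endpoint):
// probeTcp('my-db.example.rds.amazonaws.com', 5432).then(console.log);
```

A `timeout` result would point towards security groups, routing, or VPC peering, while `ECONNREFUSED` means something actively rejected the connection at the address that was actually dialled.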
What could it be that I have overlooked?