I have a Node.js app (Express) running in Docker, deployed on AWS Lightsail Container Service. It uses a PostgreSQL database on AWS RDS (via knex).

This might be several related problems.

The connection usually works fine for about two days. Then, sometimes after running migrations, the app on Lightsail can no longer connect to RDS.

For some time I can still connect to the RDS DB from DBeaver, from my locally running non-dockerized app, and from my locally running dockerized version of the app, but after a while these connections also stop working.

At that point there is no way to access the RDS DB anymore.

I then start doing things like deleting and recreating the VPC peering (Lightsail VPC to default VPC), deleting and re-adding the inbound rules of the security group, and creating another database. Basically everything I can think of. At some point the connection starts working again. Unfortunately, this time it only works from my dev machine and still fails from the app on Lightsail.

Things I triple-checked:

  • the DATABASE_URL: for debugging I added a route that prints the env, so I could verify that the correct DATABASE_URL connection string is set. The same string works fine for connections from my dev machine.
  • the AWS RDS DB is publicly accessible
  • the security group for the AWS RDS DB has inbound rules for the Lightsail VPC CIDR; at the moment it actually has a completely open rule for IPv4 and IPv6
  • I rebooted the RDS database
  • I added another RDS DB, which I also cannot connect to from Lightsail but can from my dev machine

The error that shows in the Lightsail Docker Logs is

[23/Oct/2022:16:28:42] Error: connect ECONNREFUSED 127.0.0.1:5432
[23/Oct/2022:16:28:42] at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1148:16)
  • Is it odd that it says 127.0.0.1?
  • I use ssl: false
  • Because the issue usually happens after rolling back and re-running 29 migrations and seeding the tables, I thought it might be a max-connection issue. That might explain the initial problem, but not why I can now connect from my machine and not from Lightsail.
  • Do I have to configure docker to allow outbound traffic on port 5432?
  • It's also strange that I'm usually not able to access any RDS database at all while the problem lasts.
  • Rolling back the Lightsail app to a previously working version works, but I don't know how to download the image for further inspection.
  • Regarding the maintenance window, I'm not sure if it's related. My main problem is that I still can't connect from Lightsail even after more than 24 hours. I suspect it's something in the AWS network layer or in the Docker networking config.
  • (I don't know how to use the AWS Reachability Analyzer because I don't know what to use as the resource for the Lightsail container app.)

What could it be that I have overlooked?

einSelbst

1 Answer

I found the issue. I was loading the wrong config because an environment var wasn't set.

export function getConfig(processVariables: ProcessVariables): Config {
  // Falls back to 'local' when ENV is unset, which is what happened in production.
  const environment: Environment = processVariables.ENV || 'local'
  switch (environment) {
    case 'production':
      return getProductionConfig(processVariables)
    case 'localdocker':
      return getLocalDockerConfig(processVariables)
    case 'local':
      return getLocalConfig(processVariables)
  }
}

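For localdocker I did set the ENV env var, but not for production, so production silently fell back to the 'local' config. That presumably explains the ECONNREFUSED 127.0.0.1:5432 error: the local config points at a database on localhost, and there is none inside the Lightsail container. Production only worked when I had temporarily pointed the local config at the live DB to run migrations and then deployed before changing it back.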
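
One way to guard against this kind of silent fallback is to fail at startup when ENV is missing or unrecognised. This is only a minimal sketch: the ProcessVariables and Config types below are simplified stand-ins (the real ones aren't shown here), the real per-environment helper functions are replaced by inline returns, and the local default URL is a hypothetical placeholder.

```typescript
// Simplified, hypothetical stand-ins for the real types.
type Environment = 'production' | 'localdocker' | 'local'
type ProcessVariables = { ENV?: string; DATABASE_URL?: string }
type Config = { environment: Environment; databaseUrl: string }

// Type guard: true only for the three known environment names.
function isEnvironment(value: string | undefined): value is Environment {
  return value === 'production' || value === 'localdocker' || value === 'local'
}

function getConfig(processVariables: ProcessVariables): Config {
  const environment = processVariables.ENV
  // No implicit default: an unset or misspelled ENV stops the app at startup.
  if (!isEnvironment(environment)) {
    throw new Error(
      `ENV must be one of production, localdocker, local; got: ${String(environment)}`
    )
  }
  switch (environment) {
    case 'production':
    case 'localdocker': {
      const databaseUrl = processVariables.DATABASE_URL
      if (!databaseUrl) {
        throw new Error(`DATABASE_URL must be set when ENV=${environment}`)
      }
      return { environment, databaseUrl }
    }
    case 'local':
      // Hypothetical local default; only used when ENV=local is set explicitly.
      return {
        environment,
        databaseUrl:
          processVariables.DATABASE_URL ?? 'postgres://127.0.0.1:5432/app_dev',
      }
  }
}
```

With this shape, a deployment that forgets to set ENV crashes immediately with a clear message instead of trying to reach 127.0.0.1:5432.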

What helped confirm the issue was debugging the knex connection string.
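
For anyone wanting to do the same, something along these lines can log where the app is actually trying to connect without leaking the password. It is a sketch under assumptions: describeConnection is a hypothetical helper, and it assumes the resolved config holds a URL-style connection string (as with DATABASE_URL) rather than a connection object.

```typescript
// Hypothetical helper: returns the connection target with the password masked.
function describeConnection(connectionString: string): string {
  // Node's built-in WHATWG URL parser also handles postgres:// URLs.
  const url = new URL(connectionString)
  const credentials = url.username ? `${url.username}:***@` : ''
  const port = url.port || '5432' // 5432 is the PostgreSQL default port
  const database = url.pathname.replace(/^\//, '')
  return `${url.protocol}//${credentials}${url.hostname}:${port}/${database}`
}

// Example: log the target once at startup, before creating the knex instance.
console.log(
  'DB target:',
  describeConnection('postgres://user:secret@127.0.0.1:5432/app')
)
```

If the logged host is 127.0.0.1 on a deployed container, the app is using a local config and not the RDS endpoint.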

I do hope I don't experience the issue again where I can't connect to RDS at all.

einSelbst