0

We're experiencing a very slow connection with RDS (Oracle) when connecting to its public IP from internet (outside VPC with a 1Gbps internet connection). The RDS instance is in a public subnet with a public IP enabled. I do a simple select from a large table and I get 300 row/s when connected to the database (via its public IP) from my computer, but from an EC2 inside the same VPC, I get +10000 row/s for the same request. I am located in France and rds db is in Frankfurt datacenter, I'm guessing this is due to a network bottleneck.

What I've tried so far :

  • EC2 in same VPC connecting to RDS (via private IP), I get ~10000 row/s
  • EC2 in same region -> RDS (via public IP): ~10000 row/s
  • local computer -> RDS (via public IP): 300 row/s
  • local computer -> SSH in EC2 bastion in same VPC as RDS -> RDS (private IP): 300 row/s
  • Copy DB to Paris datacenter and access it over public IP from my computer: 1000 row/s

I am running a t3.small instance with a gp2 100Go disk but I've tried upgrading to m5.large, with a larger 500Go disk (as this would increase the IOPS) but I'm getting the exact same results, still stuck with network limitations.

Thing is, we have to copy large amounts of data on a regular basis, from an on-prem db to rds db. From the results I get, this is due to a network bottleneck but I wanted to know if anyone had the same use-case and found a workaround to this. I've searched stack overflow (and others) and found similar questions with no answers :(

I would prefer to avoid a costly Direct Connect and I'm not sure an AWS VPN connection would speed that up...

Thanks !

MT0
  • 143,790
  • 11
  • 59
  • 117
bertrandp
  • 3
  • 3
  • You say your goal is to "copy large amounts of data on a regular basis, from an on-prem db to rds db", but your examples are selecting data from Amazon RDS to your own computer, which appears to be the opposite direction. In which direction do you wish to transfer the data? – John Rotenstein Oct 27 '22 at 22:55
  • Yes, sorry for the confusing goal, we first tried to copy these data, and it was super slow, so we tried to narrow down the limitations. We saw that even select operations were slow so this was our way to find where was the issue. But in the end, we will have daily tasks that will insert rows from another DB to RDS. – bertrandp Oct 28 '22 at 08:46

1 Answers1

1

This is the result of bandwidth and latency, caused by the 'distance' of your connection to the database.

AWS has high bandwidth, but there are some network limitations based on the size of the database instance. However, the bigger impact is probably from latency caused by the distance between your computer (on the Internet) and the database. This is caused both by physical distance and the number of 'hops' between systems required to connect the two endpoints.

Further, the protocol being used to 'select' the data is not efficient for sending data. It is probably quite 'chatty', waiting for acknowledgements of data and would likely be uncompressed.

Since your goal is to "copy large amounts of data on a regular basis", I would recommend:

  • Extract the data 'locally' (if you are extracting from Amazon RDS, then dump the data into S3)
  • Compress the data (if possible)
  • Copy the data to the destination (eg to/from S3)
  • Decompress and load the data into the database 'locally'

This will minimise the 'chatty' nature of a SELECT from a database. However, but it might increase the time from extract to load since it is going through multiple extract/compress steps.

You could also look at using tools that perform incremental extracts of data so that the whole database is not being copied (eg AWS Database Migration Service).

John Rotenstein
  • 241,921
  • 22
  • 380
  • 470
  • Thanks a lot for your answer ! That is was I assumed regarding the results, but needed confirmation :) I guess, send/extract/load to db directly will be my best option, I will look into migration service as well – bertrandp Oct 28 '22 at 08:50