Track what is causing AWS RDS to use insanely high bandwidth

Question

We run a high-traffic (100K hits/day) Drupal news website hosted on AWS. It sits behind Cloudflare and two load-balanced Varnish servers. For some reason, the bandwidth usage for RDS is very high, even though all the cache tables are held in memcache on ElastiCache. 100% of the traffic is anonymous, and except for new or updated content, it is served from Varnish.

Still, the RDS bandwidth is very high. For example, by the 18th of this month the usage is already over 15 TB. This cost is killing the whole site.

How can we detect what is eating all the bandwidth? How do we go and find out the root cause?

See the details copied from my billing page:

Bandwidth
$0.000 per GB - data transfer in per month - 4.808 GB - $0.00
$0.000 per GB - first 1 GB of data transferred out per month - 1 GB - $0.00
$0.010 per GB - regional data transfer - in/out/between EC2 AZs or using IPs or ELB - 15,147.744 GB - $151.48
$0.120 per GB - up to 10 TB / month data transfer out - 20.759 GB - $2.49


Total: $153.97
Region Total:   $154.04

$154 doesn't seem like a site-breaking cost. RDS by itself costs more than that. — Joel E Salas, Dec 17 '15 at 22:55
But more seriously it's a bit confusing. Where is the ~1TB/day being measured in relation to your infrastructure ? — user9517, Dec 17 '15 at 23:04
$154 is just for the transfer, above that we have all the other charges. — Safwan Erooth, Dec 18 '15 at 06:54

score 6 · Accepted Answer · answered Dec 17 '15 at 23:11

You should always use the private IP addresses to communicate between your various infrastructure components (RDS, ElastiCache, whatever). If you use the public IP address, then you will be billed for regional data transfer, because the traffic leaves and re-enters AWS.

Check your application carefully for something that is inappropriately accessing a backend component using a public IP address.
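One way to run that check is to resolve each backend hostname from the web server itself and see whether it maps to a private or a public address. The following is only a minimal sketch: the helper names are made up for illustration, and it assumes the hostnames are the ones your application is configured with (for Drupal, typically those in `settings.php`).

```python
import ipaddress
import socket


def is_private_ip(ip: str) -> bool:
    """Return True if the address falls in a private range such as 10.0.0.0/8."""
    return ipaddress.ip_address(ip).is_private


def check_endpoint(hostname: str) -> str:
    """Resolve a backend hostname and report whether it maps to a private IP.

    Needs working DNS. Run it on the instance that opens the connection,
    because the answer can differ inside and outside the VPC.
    """
    ip = socket.gethostbyname(hostname)
    kind = "private" if is_private_ip(ip) else "PUBLIC"
    return f"{hostname} -> {ip} ({kind})"
```

Calling `check_endpoint()` with your RDS and ElastiCache hostnames and getting `PUBLIC` back for any of them would point at the kind of misconfiguration described above.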

Michael Hampton

Is there any other way to find out how much bandwidth is used, and which queries or instances are pulling too much from RDS? – Safwan Erooth Dec 18 '15 at 06:54
That's a completely different question. – Michael Hampton Dec 18 '15 at 06:54
Do you recommend that I create a new question, or can you give me some pointers? – Safwan Erooth Dec 18 '15 at 11:48
Look for `SELECT *` or tables with large `TEXT`/`BLOB`/`BINARY` fields. – ceejayoz Jun 08 '16 at 01:18
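
As a rough illustration of the last pointer, the sketch below flags queries that start their column list with `*` or `alias.*`. It assumes you have already collected query strings somewhere (for example from a database query log, which is an assumption and not something described in this thread); the function name is hypothetical, and the pattern is deliberately simple, so it will miss a `*` that appears later in the column list.

```python
import re

# Matches "SELECT *" and "SELECT alias.*", but not "SELECT COUNT(*)".
SELECT_STAR = re.compile(r"\bSELECT\s+(?:\w+\.)?\*", re.IGNORECASE)


def find_select_star(queries):
    """Return the queries whose column list begins with * or alias.*."""
    return [q for q in queries if SELECT_STAR.search(q)]
```

Queries flagged this way are candidates for closer inspection, especially when the tables involved hold large `TEXT` or `BLOB` columns.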

Zeus · Answer 2 · 2016-06-08T01:14:33.150

The item "regional data transfer - in/out/between EC2 AZs or using IPs or ELB" accounts for data transfer:

Between EC2 instances not using the private IP
Between EC2 instances on different availability zones
Between EC2 instances and ELBs

If your cost for "regional data transfer - in/out/between EC2 AZs or using IPs or ELB" is $151.48, then the total data transferred across these three items is 15.148 TB.
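
The figure follows directly from the billing line in the question. As a quick check, using the rate and usage copied from that line and taking 1 TB as 1,000 GB:

```python
RATE_PER_GB = 0.010     # USD per GB, from the billing line in the question
BILLED_GB = 15_147.744  # GB, from the same billing line


def transfer_cost(gb: float, rate_per_gb: float) -> float:
    """Cost in USD, rounded to cents."""
    return round(gb * rate_per_gb, 2)


cost = transfer_cost(BILLED_GB, RATE_PER_GB)
print(f"{BILLED_GB:,.3f} GB x ${RATE_PER_GB:.3f}/GB = ${cost:.2f}")
print(f"{BILLED_GB / 1000:.3f} TB")  # assumes 1 TB = 1,000 GB
```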

So always use the private IP address when transferring data between EC2 instances in the same availability zone. That's probably your problem.

If you want to get rid of this cost entirely, you would have to run all EC2 instances in the same availability zone, but I wouldn't recommend it. If something happens to that availability zone, your service would probably go offline.

You can use AWS Billing and Cost Management reports to find out which resources are the largest contributors to that expense. Add tags to your instances, then in the AWS Billing and Cost Management console, create a report that is delivered to an S3 bucket.
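
Once such a report lands in the bucket, a few lines of scripting can rank resources by regional transfer. This is only a sketch under stated assumptions: the column names (`UsageType`, `UsageQuantity`, `ResourceId`) and the `DataTransfer-Regional-Bytes` marker are assumptions about the report layout, so check them against the header row of your own file, and the function name is made up for illustration.

```python
import csv
import io
from collections import defaultdict


def regional_transfer_by(report_csv: str,
                         group_by: str = "ResourceId",
                         usage_marker: str = "DataTransfer-Regional-Bytes"):
    """Sum UsageQuantity per group for rows whose UsageType contains the marker.

    Column names and the marker are assumed, not confirmed by the thread.
    Returns (group, total) pairs, largest total first.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(report_csv)):
        if usage_marker in (row.get("UsageType") or ""):
            key = row.get(group_by) or "(untagged)"
            totals[key] += float(row.get("UsageQuantity") or 0)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

Passing the name of a tag column as `group_by` instead of `ResourceId` would group the same totals by tag, which is where tagging the instances pays off.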
