We run a high-traffic (100K hits/day) Drupal news website hosted on AWS. It's behind Cloudflare and two load-balanced Varnish servers. For some reason the bandwidth usage for RDS is very high, even though all the cache tables are hosted in memcache on ElastiCache. 100% of the traffic is anonymous. Except for new or updated content, the traffic is served from Varnish.

But the RDS bandwidth is still very high. For example, by the 18th of this month the usage is already 15 TB+. This cost is killing the whole site.

How can we detect what is eating all the bandwidth? How do we find the root cause?

See the details copied from my billing page:

Bandwidth
$0.000 per GB - data transfer in per month - 4.808 GB - $0.00
$0.000 per GB - first 1 GB of data transferred out per month - 1 GB - $0.00
$0.010 per GB - regional data transfer - in/out/between EC2 AZs or using IPs or ELB - 15,147.744 GB - $151.48
$0.120 per GB - up to 10 TB / month data transfer out - 20.759 GB - $2.49

Total: $153.97
Region Total: $154.04
Safwan Erooth

2 Answers

You should always use the private IP addresses to communicate between your various infrastructure components (RDS, ElastiCache, whatever). If you use the public IP address, then you will be billed for regional data transfer, because the traffic leaves and re-enters AWS.

Check your application carefully for something that is inappropriately accessing a backend component using a public IP address.
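One way to start that check is to resolve each backend hostname from a web server and see which kind of address comes back. The following is a minimal sketch, not a definitive audit tool: `is_private_address` and `check_backend` are names made up for illustration, and you would pass in whatever hosts appear in your own Drupal settings.

```python
import ipaddress
import socket


def is_private_address(address: str) -> bool:
    """Return True if Python's ipaddress module treats the IP as private.

    This covers the usual internal ranges (10/8, 172.16/12, 192.168/16).
    """
    return ipaddress.ip_address(address).is_private


def check_backend(hostname: str) -> str:
    """Resolve a backend hostname and report whether it maps to a private IP.

    Raises socket.gaierror if the name cannot be resolved.
    """
    address = socket.gethostbyname(hostname)
    kind = "private" if is_private_address(address) else "PUBLIC"
    return f"{hostname} -> {address} ({kind})"
```

Run `check_backend(...)` on the web servers themselves for the database and memcache hosts the application is configured with. Any host reported as `PUBLIC` is a candidate for the regional data transfer charge.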

Michael Hampton

The item "regional data transfer - in/out/between EC2 AZs or using IPs or ELB" covers data transfer:

  • Between EC2 instances not using the private IP
  • Between EC2 instances on different availability zones
  • Between EC2 instances and ELBs

If your cost for "regional data transfer - in/out/between EC2 AZs or using IPs or ELB" is $151.48, then the data transferred across these three items totals 15.148 TB.
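As a quick sanity check, the arithmetic can be reproduced from the figures in the question's bill. This is only a sketch: the quantities and rates are copied from the billing excerpt, `line_cost` is a helper name made up here, and the TB figure assumes 1 TB = 1000 GB, matching the rounding used in this answer.

```python
from decimal import Decimal, ROUND_HALF_UP


def line_cost(gigabytes: Decimal, rate_per_gb: Decimal) -> Decimal:
    """Cost of one billing line, rounded to cents."""
    return (gigabytes * rate_per_gb).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


# Figures copied from the billing excerpt in the question.
regional_gb = Decimal("15147.744")
internet_out_gb = Decimal("20.759")

regional_cost = line_cost(regional_gb, Decimal("0.010"))
internet_out_cost = line_cost(internet_out_gb, Decimal("0.120"))
total = regional_cost + internet_out_cost

# Assumes 1 TB = 1000 GB.
regional_tb = regional_gb / Decimal("1000")

print(f"regional: {regional_tb} TB, ${regional_cost}")
print(f"internet out: ${internet_out_cost}")
print(f"total: ${total}")
```

Nearly all of the bill comes from the regional line, so the traffic between internal components is the thing to chase rather than traffic to the internet.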

So always use the private IP address when transferring data between EC2 instances in the same availability zone. That's probably your problem.

If you want to get rid of this cost entirely, you could run all EC2 instances in the same availability zone, but I wouldn't recommend it. If something happens to that availability zone, your service would probably go offline.

You can use AWS Billing and Cost Management reports to find the largest contributors to that expense. Add tags to your instances, then in the AWS Billing and Cost Management console set up a report delivered to an S3 bucket.
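Once a report file is in the bucket, a few lines of scripting can rank resources by regional transfer. The sketch below rests on assumptions you should verify against your own file: the column names `UsageType`, `ResourceId` and `UsageQuantity`, and the `DataTransfer-Regional-Bytes` usage type string, are modeled on the detailed billing report and may differ in yours. The sample rows are made up for illustration and are not real billing data.

```python
import csv
import io
from collections import defaultdict


def regional_transfer_by_resource(report_csv: str) -> list[tuple[str, float]]:
    """Sum UsageQuantity per ResourceId for regional data transfer rows."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(report_csv)):
        # Assumed usage type naming; a region prefix such as "USW2-" may precede it.
        if "DataTransfer-Regional-Bytes" in row["UsageType"]:
            totals[row["ResourceId"]] += float(row["UsageQuantity"])
    # Largest contributors first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up sample rows, for illustration only.
SAMPLE = """UsageType,ResourceId,UsageQuantity
DataTransfer-Regional-Bytes,i-aaaa1111,120.5
DataTransfer-Regional-Bytes,i-bbbb2222,30.0
DataTransfer-Regional-Bytes,i-aaaa1111,79.5
DataTransfer-Out-Bytes,i-aaaa1111,5.0
"""

for resource_id, gigabytes in regional_transfer_by_resource(SAMPLE):
    print(f"{resource_id}: {gigabytes:.1f} GB")
```

Grouping by a tag column instead of `ResourceId` works the same way if you have tagged your instances by role (web, Varnish, database).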

Zeus