My ingestion pipeline is the following:

SQS queue -> EC2 -> RDS
I have observed the following: when I first turn on my EC2 instances to ingest from SQS, the writes/second on RDS are really fast, but after about 4 hours the SQL write time starts to increase.
Here are two graphs for illustration:

Writes per second: https://i.stack.imgur.com/XavU0.jpg
Queue depth: https://i.stack.imgur.com/vXkGr.jpg
There are still a lot of messages in the queue, so I verified that a lack of data from SQS is not the source of the problem. I also tried logging the SQL write time, from UPDATE to COMMIT, and found that the write latency has increased tenfold, so it's probably not a CPU credit issue on EC2. However, if I turn the EC2 ingestion off and turn it back on a day later, I see great performance again. The timing is captured roughly as in the sketch below.
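For reference, a minimal sketch of how the UPDATE-to-COMMIT timing can be measured, assuming Python with psycopg2 (the connection details, table, and columns are placeholders, not my actual schema):

```python
import time

import psycopg2  # assuming a PostgreSQL RDS instance; all identifiers are placeholders

conn = psycopg2.connect(host="my-rds-endpoint", dbname="ingest",
                        user="app", password="secret")

def timed_write(cur, msg_id, payload):
    """Time a single write from UPDATE through COMMIT."""
    start = time.perf_counter()
    cur.execute("UPDATE events SET payload = %s WHERE id = %s",
                (payload, msg_id))
    conn.commit()
    return time.perf_counter() - start

with conn.cursor() as cur:
    latency = timed_write(cur, 1, "example payload")
    print(f"write latency: {latency:.4f} s")  # ~0.002 s when healthy, ~0.04 s degraded
```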
Here is what I verified (credit metrics pulled from CloudWatch, as shown in the sketch after this list):

- The RDS CPU credit balance barely dropped at all.
- The EC2 CPU credit balance did drop from 150 to 6, but the RDS latency increase starts around 2am while EC2 only exhausts its CPU credits around 9am, and the bottleneck is still the SQL write time, which increased from 0.002 s to 0.04 s.
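For completeness, a sketch of how the credit balances can be pulled with boto3 (the region, EC2 instance ID, and RDS identifier are placeholders):

```python
import boto3
from datetime import datetime, timedelta, timezone

cw = boto3.client("cloudwatch", region_name="us-east-1")  # placeholder region

def credit_balance(namespace, dim_name, dim_value):
    """Fetch the average CPUCreditBalance over the last 12 hours."""
    now = datetime.now(timezone.utc)
    resp = cw.get_metric_statistics(
        Namespace=namespace,
        MetricName="CPUCreditBalance",
        Dimensions=[{"Name": dim_name, "Value": dim_value}],
        StartTime=now - timedelta(hours=12),
        EndTime=now,
        Period=300,
        Statistics=["Average"],
    )
    return sorted(resp["Datapoints"], key=lambda d: d["Timestamp"])

# Placeholder identifiers for the EC2 and RDS instances
for point in credit_balance("AWS/EC2", "InstanceId", "i-0123456789abcdef0"):
    print("EC2", point["Timestamp"], point["Average"])

for point in credit_balance("AWS/RDS", "DBInstanceIdentifier", "my-rds-instance"):
    print("RDS", point["Timestamp"], point["Average"])
```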
I am using two t2.micro EC2 instances and one t2.medium RDS instance.
I suspect there is some network bandwidth limit that I need to raise, or maybe running low on EC2 CPU credits somehow increases the latency of the SQL writes? One way I can think of to check the network theory is sketched below.
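If network throughput is the limit, it should show up in the standard EC2 NetworkOut CloudWatch metric; a sketch of that query (same placeholder region and instance ID as above):

```python
import boto3
from datetime import datetime, timedelta, timezone

cw = boto3.client("cloudwatch", region_name="us-east-1")  # placeholder region
now = datetime.now(timezone.utc)

# NetworkOut is reported in bytes; Sum over each 5-minute period.
resp = cw.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="NetworkOut",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    StartTime=now - timedelta(hours=12),
    EndTime=now,
    Period=300,
    Statistics=["Sum"],
)
for point in sorted(resp["Datapoints"], key=lambda d: d["Timestamp"]):
    print(point["Timestamp"], point["Sum"] / 300 / 1e6, "MB/s")
```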
Can someone point me in the right direction?