I have a simple EKS cluster running 3 nodes, one node per AZ, with each AZ having its own NAT gateway. By design of the application architecture, no application pods communicate with each other; the only cross-service traffic is to a Redis cluster in a different VPC, which is reachable over VPC peering.
I see close to 1.7 GB of data transfer every hour, and when I run the CloudWatch Logs Insights query below against the VPC flow logs, I can account for that 1.7 GB:
filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.')
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10
Here x.x.x.x is my NAT gateway's IP and y.y. is the VPC CIDR prefix. This clearly shows that traffic reaching this NAT gateway is being routed back into the VPC itself, not out to the public internet, and this traffic alone is 1.7 GB per hour. The other two NAT gateways, in the other AZs, don't see anywhere near this much traffic (even though each NAT gateway's subnet has just one EKS node for now, with only a few pods running on it).
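To see what this traffic is actually targeting, my next step is to break the same flows down by destination port. This is just a sketch along the lines of the query above; it assumes the default flow log fields that Logs Insights auto-discovers (srcAddr, dstAddr, dstPort, bytes), with x.x.x.x again being the NAT gateway IP. If the top port came back as 6379, for instance, that would suggest the Redis traffic is taking the NAT path instead of the peering route:

filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.')
| stats sum(bytes) as bytesTransferred by srcAddr, dstPort
| sort bytesTransferred desc
| limit 10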
I am trying to find which pod is responsible, and why so much data transfer is happening even though the pods don't communicate with each other by design of the application. Please suggest some tools or ideas to debug this issue.
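One idea I have sketched out myself is grouping the same flows by ENI so I can at least pin the traffic to a specific node; interfaceId should be another field Logs Insights auto-discovers for flow logs in the default format (again, just a sketch under that assumption):

filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.')
| stats sum(bytes) as bytesTransferred by interfaceId, srcAddr
| sort bytesTransferred desc
| limit 10

From the ENI I can work back to the owning node in the EC2 console and list the pods scheduled on it, but I would still like a more direct way to attribute the traffic to a single pod.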