My IIS server has always experienced strange, intermittent 20-30s lag spikes on fairly standard API calls in my C# app. I've added the server details below in case they are relevant.
To see whether my app was at fault, I added AWS X-Ray for monitoring and also added my own subsegments to see if specific areas of code were to blame. According to the data, each of these areas runs fine and is not the cause of the latency.
Whenever I analyze calls that took longer than expected, including a recent 3.48s call that should normally take about 300ms (see screenshot below), the evidence is always the same: the first row in X-Ray shows a large duration (3s to 30s), while the subsegments I see when I expand it are all in the range of milliseconds and do not add up to the 3.48s, 30s, etc. shown for the overall call.
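To make the comparison concrete, here is a minimal Python sketch of the arithmetic I am doing by eye. It assumes the trace is available as a segment document with `start_time`, `end_time`, and a `subsegments` array (the field names X-Ray segment documents use). The function name `unaccounted_seconds` and the subsegment timings are hypothetical, chosen only to mirror the 3.48s call; they are not taken from my real trace.

```python
def unaccounted_seconds(segment):
    """Return (total, covered, gap) in seconds for one segment dict.

    total   = wall-clock duration of the top-level segment
    covered = time covered by at least one direct subsegment
    gap     = time inside the segment that no subsegment explains
    """
    total = segment["end_time"] - segment["start_time"]

    # Sort subsegment intervals and merge overlaps so that work running
    # in parallel is not counted twice.
    intervals = sorted(
        (s["start_time"], s["end_time"]) for s in segment.get("subsegments", [])
    )
    covered = 0.0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end:
            # A new, non-overlapping interval begins: close the previous one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps the current interval: extend it if needed.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start

    return total, covered, total - covered


# Hypothetical example: a 3.48s segment whose subsegments only cover
# the last few hundred milliseconds (timings are illustrative).
example_segment = {
    "name": "api-call",
    "start_time": 0.00,
    "end_time": 3.48,
    "subsegments": [
        {"name": "load-data", "start_time": 3.20, "end_time": 3.32},
        {"name": "build-response", "start_time": 3.30, "end_time": 3.45},
    ],
}

total, covered, gap = unaccounted_seconds(example_segment)
print(f"total={total:.2f}s covered={covered:.2f}s unaccounted={gap:.2f}s")
```

In my real traces the "unaccounted" figure is almost the entire duration of the call, which is what I am trying to interpret.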
How should I interpret this? When the overall call takes that long but the expanded trace details are only milliseconds, does it simply mean that my app code is fine and that something around it (network latency, worker processes, the web server, etc.) is to blame?
I'm just trying to understand where I need to start looking if that makes sense.
IIS Server
The server is an EC2 t2.medium on AWS, and traffic is negligible (perhaps 100 API calls a day). I have a single server due to the low load. On the app pool, maximum worker processes is set to 7, queue length to 1000, and start mode to AlwaysRunning. I have not done much by way of fine-tuning the server.
Thank you so much for your time and any guidance.