X-Ray Traces - Segment Timeline Duration and How to Interpret Results

Question

My IIS server has always experienced strange, intermittent 20-30s lag spikes on fairly standard API calls in my C# app. I'll add the server details below in case anyone is interested.

To see if my app was at fault, I added X-Ray for monitoring and even added my own subsegments to see if specific areas of code were to blame. According to my data, each of these areas runs fine and is not the cause of the latency.

Whenever I analyze calls that took longer than expected, such as a recent 3.48s call that should normally take 300ms (see screenshot below), the evidence is always the same: the first row in X-Ray shows a large number (3s to 30s), while the details I see when I expand it are always in the range of milliseconds and do not add up to the total (3.48s, 30s, etc.).

I wanted to ask how to interpret this. When the overall call takes that long but the expanded trace details are in milliseconds, does it simply mean that my app code is fine and that something around it (network latency, worker processes, web server, etc.) is to blame?

I'm just trying to understand where I need to start looking if that makes sense.

IIS Server

The server is an EC2 t2.medium on AWS, and traffic is negligible (perhaps 100 API calls a day). I have a single server due to the low load. On the app pool, my IIS maximum worker processes is set to 7, queue length to 1000, and start mode to AlwaysRunning. I've not done much by way of fine-tuning the server.

Thank you so much for your time and any guidance.

It looks like the time is spent on work that happens before the first subsegment is recorded. Have you checked whether your static initialisation logic has any bits that could spike (e.g. making network calls, stale connections, etc.)? Or perhaps you have some middleware chain that can cause this? — Tofig Hasanov, Jul 28 '23 at 02:16
This is a really good point - you're saying maybe look at the initialization methods and code that runs when things bootstrap... You know, I have no idea why I did not think about that, but that is fantastic advice, thank you very much. Any tips? Just use the SDK to begin/end segments (and create subsegments) to dig in? — NullHypothesis, Jul 28 '23 at 02:26
You can try adding X-Ray segments earlier in your application cycle. I am not a C# expert, but I also assume there probably exist C# specific tools to collect profiling data from your application. — Tofig Hasanov, Jul 28 '23 at 02:31

score 1 · Accepted Answer · answered Aug 07 '23 at 18:21

You can look at the raw trace data in the console to see the start and end time of each of the subsegments. This can guide you to where you should focus. If there is a large gap between the start time of the segment and the start time of the first subsegment, then the time is being spent before your instrumented code runs, so you should focus on the start of the code represented by the segment.

Click on a trace in the AWS X-Ray console, then click "Raw Data" in the top right.

You should see the "start_time" and "end_time" of the segments and subsegments.
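If you want to do that comparison for more than a couple of traces, a small script can help. Below is a rough sketch in Python, not an official tool: `find_gaps` is a name I made up, and the sample segment is invented data that only mirrors the "start_time", "end_time", "name" and "subsegments" fields you see in the raw data (times are epoch seconds). It lists the stretches of a segment that no direct subsegment covers.

```python
def find_gaps(segment, min_gap=0.0):
    """Return (label, seconds) pairs for time inside a segment that is
    not covered by any of its direct subsegments.

    `segment` is assumed to be a dict shaped like an X-Ray segment
    document: start_time/end_time in epoch seconds plus an optional
    list of subsegments with the same fields and a name.
    """
    subs = sorted(segment.get("subsegments", []),
                  key=lambda s: s["start_time"])
    gaps = []
    cursor = segment["start_time"]  # end of the time accounted for so far
    for sub in subs:
        gap = sub["start_time"] - cursor
        if gap > min_gap:
            gaps.append((f"before {sub['name']}", gap))
        # In-progress subsegments may have no end_time yet.
        cursor = max(cursor, sub.get("end_time", sub["start_time"]))
    tail = segment["end_time"] - cursor
    if tail > min_gap:
        gaps.append(("after last subsegment", tail))
    return gaps


# Invented sample: a 3.48s segment whose subsegments only start after 3.10s.
segment = {
    "name": "MyApi",
    "start_time": 1000.00,
    "end_time": 1003.48,
    "subsegments": [
        {"name": "LoadUser", "start_time": 1003.10, "end_time": 1003.25},
        {"name": "SaveOrder", "start_time": 1003.26, "end_time": 1003.47},
    ],
}

gaps = find_gaps(segment, min_gap=0.05)
for label, seconds in gaps:
    print(f"{label}: {seconds:.2f}s")
```

With data shaped like the sample, nearly all of the duration shows up as a gap before the first subsegment, which matches the pattern in the question: the instrumented code is fast, and the time goes to whatever runs before it.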
