Periodically we see many "Request timed out" exceptions (System.Web.HttpException) from a specific endpoint that is called often.
It does not appear to be related to peak-load periods: it has occurred right after deployment and at random times, with no discernible pattern.
Increasing the execution timeout is not the solution, as the requests normally complete within seconds.
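For context, the timeout referred to here is the ASP.NET `executionTimeout` attribute on `httpRuntime`. Below is a minimal sketch of where it lives in web.config. The value shown is the framework default of 110 seconds and is an assumption for illustration, not necessarily what our application uses. ASP.NET only enforces this timeout when `debug` is `false` on the `compilation` element.

```xml
<configuration>
  <system.web>
    <!-- executionTimeout is in seconds; 110 is the framework default (illustrative, not our actual value) -->
    <httpRuntime executionTimeout="110" />
    <!-- the execution timeout is only enforced when debug is false -->
    <compilation debug="false" />
  </system.web>
</configuration>
```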
Neither the web server nor the backend SQL Server is stressed. We have even seen low CPU usage during an incident period.
ApplicationInsights shows the exact endpoint that is failing, which is a standard controller action. However, there is no additional information: no stack trace, no error code, nothing. The exception is thrown anywhere between 1 second and several minutes after the request starts.
ApplicationInsights also shows that some of the requests to the failing endpoint do complete, but with extremely long response times (up to 8 minutes).
I have found nothing in the IIS logs. We have set up failed request logging and are waiting for the next incident. However, we do not expect it to give us more information than we already have from ApplicationInsights.
I'm uncertain whether this is an ASP.NET MVC application issue or an IIS configuration issue. It puzzles me that no stack trace is available.
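For reference, a failed request tracing rule along the following lines should capture both slow and failing requests. This is a sketch, not our exact configuration: the 30-second threshold, the status code range, and the selected trace areas are assumptions. It also assumes the IIS Tracing feature is installed and failed request tracing is enabled for the site.

```xml
<configuration>
  <system.webServer>
    <tracing>
      <traceFailedRequests>
        <!-- path="*" traces every URL; narrow it to the failing endpoint if known -->
        <add path="*">
          <traceAreas>
            <add provider="ASPNET" areas="Infrastructure,Module,Page,AppServices" verbosity="Verbose" />
            <add provider="WWW Server" areas="Authentication,Security,Filter,StaticFile,CGI,Compression,Cache,RequestNotifications,Module" verbosity="Verbose" />
          </traceAreas>
          <!-- log any request that takes longer than 30 seconds or returns a 5xx status (both values assumed) -->
          <failureDefinitions timeTaken="00:00:30" statusCodes="500-599" />
        </add>
      </traceFailedRequests>
    </tracing>
  </system.webServer>
</configuration>
```

A time-based rule matters here because the requests that eventually complete after several minutes would not match a status-code-only rule.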
Any suggestions on how to approach this challenge? Pointers to articles/blogs that can help me solve the issue are very much appreciated.
UPDATE
I was looking through our trace logs and realized that they were not complete, i.e., entries were missing. We use ApplicationInsights (AI) for tracing. AI is configured to keep all traces, exceptions, and events, and it is working flawlessly in DEV and STAGING.
We have two AI environments: AI-PROD and AI-TEST. The environment is selected in web.config via the instrumentation key. The entire AI configuration is in ApplicationInsights.config, and this file is identical in DEV, STAGING, and PROD.
I tried to connect STAGING to the AI-PROD environment to verify that it was not a problem with the environment. It worked flawlessly.
I disabled AI in PROD, and the server started without throwing “Request timed out” errors during startup. When PROD is connected to either the AI-PROD or the AI-TEST environment, I get “Request timed out” errors during startup.