
We are currently monitoring a web API using the Performance page of Application Insights, which gives us the number of requests received per operation.

Our API solution uses APIM (API Management) as the frontend and an App Service as the backend. Both instances have Application Insights enabled, but we don't see a reasonable correlation between the number of requests to APIM and the number of requests to the App Service. This is most noticeable in only a couple of operations.

For example, the Apim-GetUsers operation has a count of 60,000 requests per day (APIM's Application Insights instance):

[Screenshot: APIM Application Insights Performance page]

The AS-GetUsers operation has a count of 3,000,000 requests per day (the App Service's Application Insights instance):

[Screenshot: App Service Application Insights Performance page]

Apim-GetUsers routes the request to AS-GetUsers, and Apim-GetUsers is the only operation that can call AS-GetUsers.

Given this, I would expect to see ~60,000 requests for that operation on the App Service's Application Insights Performance page; instead, we see that much larger number.

I looked into this issue a little and found out about sampling, and that some Application Insights features use the itemCount property to reconstruct the original number of requests. In summary:

  • Is my expectation correct, and if so, what could cause this? Also, would disabling adaptive sampling and using a fixed sampling rate give me the expected result (see the sketch after this list)?

  • Is my expectation wrong, and if so, what is a good way to get the expected result? Should I not be using the Performance page for that metric?
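
For the first question, this is a minimal sketch of how adaptive sampling could be disabled and a fixed sampling rate applied with Microsoft.ApplicationInsights.AspNetCore on ASP.NET Core; the 10% rate is just an illustrative value, not something from the original setup:

```csharp
using Microsoft.ApplicationInsights.AspNetCore.Extensions;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Disable adaptive sampling so the SDK no longer adjusts the
        // telemetry volume dynamically.
        services.AddApplicationInsightsTelemetry(new ApplicationInsightsServiceOptions
        {
            EnableAdaptiveSampling = false
        });
    }

    public void Configure(IApplicationBuilder app, TelemetryConfiguration configuration)
    {
        // Add a fixed-rate sampling processor instead.
        var builder = configuration.DefaultTelemetrySink.TelemetryProcessorChainBuilder;
        builder.UseSampling(10); // fixed 10% sampling (arbitrary example value)
        builder.Build();
    }
}
```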

I haven't tried much yet, as I don't have access to change the settings until I can propose a viable solution, but I looked into sampling and the itemCount property as mentioned above. APIM's sampling is set to 100%.

I ran a query in Log Analytics against the requests table. When I simply counted the rows, I got a number close to the one I see in APIM, but when I summed itemCount, as some Microsoft docs suggest, I got the same huge number shown on the Performance page.
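
For reference, the two queries looked roughly like this (a minimal sketch; the operation-name filter and time range are illustrative, since the actual query filtered on url):

```kusto
// Raw row count: only the telemetry rows that survived sampling.
requests
| where timestamp > ago(1d)
| where operation_Name == "AS-GetUsers"   // illustrative; the real query used: where url contains "users"
| summarize RawRowCount = count()

// Sampling-adjusted count: each row's itemCount says how many original
// requests it represents, so summing it reconstructs the pre-sampling total.
// This is the number the Performance page shows.
requests
| where timestamp > ago(1d)
| where operation_Name == "AS-GetUsers"
| summarize AdjustedCount = sum(itemCount)
```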

List of NuGet packages and versions that you are using:

  • Microsoft.Extensions.Logging.ApplicationInsights 2.14.0
  • Microsoft.ApplicationInsights.AspNetCore 2.14.0

Runtime version (e.g. net461, net48, netcoreapp2.1, netcoreapp3.1, etc. You can find this information from the *.csproj file):

  • netcoreapp3.1

Hosting environment (e.g. Azure Web App, App Service on Linux, Windows, Ubuntu, etc.):

  • App Service on Windows

Edit 1: Picture of operation_Id and itemCount


DanielM
  • There should not be a difference between adaptive sampling and fixed sampling here. Do you see different operation_Id values in the requests table? The way sampling works is that the SDK hashes operation_Id and then compares the hash with the sampling threshold (whether it is adaptive or fixed). One artefact I saw when operation_Id was reused is that every such telemetry item gets the same itemCount, but in reality they are either all in or all out. Let's rule that out first. – ZakiMa Aug 02 '21 at 20:11
  • Thanks for that insight @ZakiMa, although I am not sure I understood you correctly. I ran a query like requests | where url contains "users" and all the operation_Id values are different. However, the itemCount for a lot of the requests was in the hundreds, and that count was the same across many of them; is that what you were referring to? I added a picture of what I see to the post. – DanielM Aug 03 '21 at 07:39
  • The fact that every request got a unique operation_Id rules out the artefact due to reused operation identifiers. So it seems we're good here. Quick question: is this telemetry submitted using auto-instrumentation (i.e. the Application Insights SDK was added but there are no changes in the pipeline, such as a TelemetryProcessor/TelemetryInitializer)? – ZakiMa Aug 03 '21 at 21:27
  • We have this one line in the startup file: services.AddApplicationInsightsTelemetry();. And we use ILogger to log any errors or warnings when something fails. We haven't written any custom telemetry initializers or processors. – DanielM Aug 04 '21 at 02:37

0 Answers