
I'm trying to set up health check alerts for critical functionality across my site. For things like registrations, payments, and critical emails, I have started logging custom event telemetry using the telemetry client, like so:

using Microsoft.ApplicationInsights;
// Log a custom event named after the email type.
var tc = new TelemetryClient();
tc.TrackEvent(emailType.ToString());

This is currently working great, and I'm able to create an Application Insights Analytics dashboard from this data, which forms the basis of my alerts.

From the portal, I have now started creating alerts whose criterion is a custom log search (Azure Portal > Application Insights > Alerts > Add New Rule > Add Criteria), shown below:

[Screenshot: Custom Log Search]

The problem is that the period has a maximum length of 24 hours, which means that for an event that fires infrequently (let's say once over the course of a week), we would get false alerts on a daily basis.
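To illustrate the problem, here is a minimal sketch. It assumes the rule is evaluated once a day and raises an alert whenever no matching event falls inside the lookback period. The class name, method name, and event timestamps are hypothetical, and the code only simulates the window arithmetic; it does not call Application Insights.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AlertWindowSketch
{
    // Counts the daily evaluations that would raise a "no events seen" alert.
    // The rule is evaluated once per day for `days` days; each evaluation
    // looks back `windowHours` hours from the evaluation time.
    public static int CountFalseAlerts(
        IEnumerable<DateTime> eventTimes, DateTime start, int days, int windowHours)
    {
        var events = eventTimes.ToList();
        var alerts = 0;
        for (var day = 1; day <= days; day++)
        {
            var evaluatedAt = start.AddDays(day);
            var windowStart = evaluatedAt.AddHours(-windowHours);
            // An alert is raised when no event lies in (windowStart, evaluatedAt].
            var seen = events.Any(t => t > windowStart && t <= evaluatedAt);
            if (!seen)
            {
                alerts++;
            }
        }
        return alerts;
    }

    public static void Main()
    {
        var start = new DateTime(2018, 4, 16, 0, 0, 0, DateTimeKind.Utc);
        // Hypothetical data: a healthy event that fires once a week,
        // once in the previous week and once in the simulated week.
        var events = new[] { start.AddDays(-4.5), start.AddDays(2.5) };

        Console.WriteLine(CountFalseAlerts(events, start, 7, 24));  // 6
        Console.WriteLine(CountFalseAlerts(events, start, 7, 48));  // 5
        Console.WriteLine(CountFalseAlerts(events, start, 7, 168)); // 0
    }
}
```

Under these assumptions, a 24-hour period alerts on six of the seven days even though the event is firing exactly as expected, while a seven-day period would not alert at all.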

The question is: how can I set up alerting in Application Insights for events like these?

I would prefer a solution that does not require additional WebJobs or code crunching numbers to figure out whether thresholds are met, as I feel an alerting system should have as few moving parts as possible.

Update 1

After I contacted Microsoft's alert feedback group, they extended the period dropdown to 48 hours; however, this is still inadequate for my use case.

I have tried seeking alternative tools like Grafana (with the App Insights plugin). Sadly, that particular plugin does not support alerting (whilst Grafana itself does).

Faesel Saeed
  • Both Alerts and Alerts classic have a time horizon of max 24 hours in the past. You could probably use an external monitoring tool (like a time-triggered Azure function) that sets up a metric (or a formatted log value) by performing an aggregation or a query over the last week if that's what you're looking for. – Horia Toma Apr 25 '18 at 10:19
  • This is certainly my fallback (we are looking into the Elasticsearch ELK stack, mixed with plugins for alerting). It feels really strange that this is not built into Azure's alerting system; in my view it makes it kind of redundant and useless. I'm not keen on a time-triggered Azure function, as it's another potential area we would have to monitor for failures (unless we start monitoring the monitoring tools :) ). The fewer moving parts in the chain, the more confidence you have in an alerting system. – Faesel Saeed Apr 25 '18 at 10:37
