
We have a cluster of instances, each of which runs a DropWizard metrics gatherer.

We're also trying to leverage AppDynamics custom metrics. The way this works is that a custom script hits the endpoint exposed by DropWizard (/metrics) and sends the metrics of interest to the AppDynamics Controller.

AppDynamics has 2 cluster rollup strategies for how a metric is displayed in the whole-application (tier) view: SUM and AVG.

This works well for things like counts (where sum is used) and average processing times (where avg is used), but we currently have no idea how to aggregate the per-instance percentiles exposed by DropWizard: neither sum nor avg looks correct.

Example:

instance1: p75=400
instance2: p75=600
instance3: p75=800

sum gives 1800, which of course isn't useful at all.

avg gives 600, which isn't correct either: we lose track of the upper bound.

If AppDynamics had a MAX cluster rollup, that would be more or less fair, though still not correct. But AppDynamics doesn't have that.

We also understand that the only fully correct way to compute cluster-wide percentiles is to aggregate the data from all nodes in one place (e.g. logstash) rather than on each instance. But for now this is what we have: custom metrics sent periodically.
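To make the difference concrete, here is a minimal Python sketch comparing the avg of per-instance p75 values with the p75 of all samples pooled in one place. The latency lists are invented for illustration (chosen so that the per-instance p75 values match the 400/600/800 example), and `percentile` is a hypothetical nearest-rank helper, not the estimator DropWizard uses internally.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples <= it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical raw latencies per instance (made-up data, not real measurements).
# instance1 handles far more requests than the other two.
instances = {
    "instance1": [50, 100, 120, 150, 180, 200, 220, 250,
                  300, 320, 350, 400, 420, 450, 480, 500],
    "instance2": [200, 400, 600, 700],
    "instance3": [500, 700, 800, 900],
}

# What each instance would report on its own.
per_instance = {name: percentile(vals, 75) for name, vals in instances.items()}

# What an AVG-style rollup of those reported values yields.
avg_of_p75 = sum(per_instance.values()) / len(per_instance)

# What aggregating all raw samples in one place yields.
pooled = [v for vals in instances.values() for v in vals]
pooled_p75 = percentile(pooled, 75)

print(per_instance)   # per-instance p75 values
print(avg_of_p75)     # average of the per-instance p75 values
print(pooled_p75)     # p75 over the whole cluster
```

With this made-up data the pooled p75 is 500, while avg gives 600 and a MAX would give 800. The per-instance values alone carry no information about how many requests each node served, which is why no simple rollup of them recovers the cluster-wide figure.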

It would be great if anyone could suggest something here.

Thanks in advance,

oceansize
