I am trying to construct a custom metric for Google StackDriver that I can use to track nodejs event loop latencies. All the apps are running in Google AppEngine, so I am confined to using the monitored resource global (at least to my understanding).
Via the nodejs @google-cloud/monitoring client I have created a metric descriptor looking like this:
{
  name: client.projectPath(projectId),
  metricDescriptor: {
    description: 'Nodejs event loop latency',
    displayName: 'Event Loop Latency',
    type: 'custom.googleapis.com/nodejs/eventloop/latency',
    metricKind: 'GAUGE',
    valueType: 'DOUBLE',
    unit: '{ms}',
    labels: [
      {
        key: 'instance_id',
        valueType: 'STRING',
        description: 'The ID of the instance reporting latency (containerId, vmId, etc.)',
      },
    ],
  },
}
I am writing data to this custom metric like this:
{
  metric: {
    type: 'custom.googleapis.com/nodejs/eventloop/latency',
    labels: {
      instance_id: instanceId,
    },
  },
  resource: {
    type: 'global',
    labels: {
      project_id: projectId,
    },
  },
  points: [{
    interval: {
      endTime: {
        seconds: item.at,
      },
    },
    value: {
      doubleValue: item.value,
    },
  }],
}
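For completeness, here is a minimal sketch of how the payload above could be wrapped into a full write request. The helper name buildWriteRequest and the sample values are hypothetical. The sketch assumes that client.projectPath(projectId) yields a string of the form projects/<projectId>, and that item has the shape { at: <unix seconds>, value: <latency in ms> }.

```javascript
// Hypothetical helper: wraps one data point into a request object.
// Assumption: item = { at: <unix seconds>, value: <latency in ms> }.
function buildWriteRequest(projectId, instanceId, item) {
  return {
    // Assumed to match what client.projectPath(projectId) returns.
    name: `projects/${projectId}`,
    timeSeries: [{
      metric: {
        type: 'custom.googleapis.com/nodejs/eventloop/latency',
        labels: { instance_id: instanceId },
      },
      resource: {
        type: 'global',
        labels: { project_id: projectId },
      },
      points: [{
        interval: { endTime: { seconds: item.at } },
        value: { doubleValue: item.value },
      }],
    }],
  };
}

// Sample values are made up for illustration.
const request = buildWriteRequest('my-project', 'instance-1', { at: 1500000000, value: 12.5 });
console.log(JSON.stringify(request, null, 2));
// With a real client, this object would be passed to client.createTimeSeries(request).
```

Keeping the builder a pure function makes it easy to test the payload shape without talking to the API.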
I thought all was good while writing my tests, until I changed my instance_id to write data in a timespan that overlapped with points another fake instance had already written. Now the monitoring client throws the error:
Error: One or more TimeSeries could not be written:
Points must be written in order. One or more of the points specified was older than the most recent stored point.
This renders my custom metric practically useless, since only one nodejs process can ever write to it.
Now my question is: how can I circumvent this? I want to be able to write from all of my running nodejs instances (x AppEngine services with y instances running).
I was thinking of a type that is indexed on nodejs/eventloop/latency/{serviceName}/{serviceVersion}/{instanceId}, but that seems a bit extreme and will quickly push me towards the quotas on the StackDriver account.
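To make the idea concrete, this is roughly how such a per-instance type could be built. It is only a sketch: the function name metricTypeFor and the sample values are hypothetical, and I have not checked whether the API accepts arbitrary service, version, or instance strings as path segments.

```javascript
// Hypothetical helper: derives one metric type per service/version/instance.
// Every distinct combination would need its own metric descriptor,
// which is why this approach eats into the descriptor quota.
const BASE_TYPE = 'custom.googleapis.com/nodejs/eventloop/latency';

function metricTypeFor(serviceName, serviceVersion, instanceId) {
  return [BASE_TYPE, serviceName, serviceVersion, instanceId].join('/');
}

// Sample values are made up for illustration.
const exampleType = metricTypeFor('api', 'v1', 'instance-1');
console.log(exampleType);
// prints "custom.googleapis.com/nodejs/eventloop/latency/api/v1/instance-1"
```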
Any suggestions are highly appreciated!