I'm working on an application that does a lot of reporting of external events. One of the metrics used most often is the event rate as a function of time; for example, measuring the sample rate of some external asynchronous sensor.

Currently I calculate the frequency of such events by keeping a queue of event timestamps. When an event occurs, I push the current timestamp onto the queue, then pop from the front until the oldest timestamp is younger than a predefined age. The event frequency is then proportional to the size of the queue. The method usually looks something like this:

from collections import deque
import time

QUEUE_DEPTH_SECONDS = 5.0
time_queue = deque()

def on_event():
    now = time.monotonic()
    time_queue.append(now)

    # Drop timestamps older than the averaging window.
    while now - time_queue[0] > QUEUE_DEPTH_SECONDS:
        time_queue.popleft()

    return len(time_queue) / QUEUE_DEPTH_SECONDS

This approach has some obvious drawbacks:

  1. The memory requirement and computation time are proportional to the event rate.
  2. The queue duration has to be tuned manually for the expected event rate, trading low-frequency performance against memory requirements.
  3. The response time of the frequency measurement also depends on the queue duration: longer durations make the calculation respond more slowly.
  4. The frequency is only updated when a new event occurs. If events stop arriving, the measurement stays frozen at the value calculated when the last event was received.

I'm curious whether there are alternative algorithms for calculating the rate of an event, and what trade-offs they have in terms of computational complexity, space requirements, response time, etc.

Paul Belanger

3 Answers


Exponential smoothing (https://en.wikipedia.org/wiki/Exponential_smoothing) is very efficient and uses only a small, bounded amount of memory. You could apply it to the inter-arrival times. When retrieving the smoothed inter-arrival time, also look at the time since the last event, and mix that in if it is larger than the smoothed inter-arrival time.
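A minimal sketch of this idea, assuming names and the smoothing factor are hypothetical: the estimator smooths the inter-arrival times, and when the rate is read, it mixes in the time since the last event so the estimate decays when events stop arriving.

```python
import time

class RateEstimator:
    """Exponentially smoothed inter-arrival time; O(1) memory and time.

    alpha is the smoothing factor (a hypothetical default)."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.smoothed_dt = None  # smoothed inter-arrival time, seconds
        self.last_ts = None      # timestamp of the most recent event

    def on_event(self, now=None):
        """Record one event; timestamps may be injected for testing."""
        if now is None:
            now = time.monotonic()
        if self.last_ts is not None:
            dt = now - self.last_ts
            if self.smoothed_dt is None:
                self.smoothed_dt = dt
            else:
                # Standard exponential smoothing update.
                self.smoothed_dt += self.alpha * (dt - self.smoothed_dt)
        self.last_ts = now

    def rate(self, now=None):
        """Events per second; falls toward zero when events stop."""
        if self.smoothed_dt is None:
            return 0.0
        if now is None:
            now = time.monotonic()
        # Mix in the time since the last event if it exceeds the
        # smoothed inter-arrival time.
        dt = max(self.smoothed_dt, now - self.last_ts)
        return 1.0 / dt
```

Note that both memory use and per-event work are constant, independent of the event rate, and the reported rate keeps decaying while no events arrive.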

This is different enough that I would in fact start by collecting a sample of timestamps in current use, so that I could use it to test the result of this or other schemes off-line.

mcdowella

One alternative is a local timer that fires at a constant rate (e.g. once per second):

  • When an external event occurs, it increments a counter.
  • When the local timer fires, the counter value is added to a queue, and the count is reset to zero.

Here's how the method compares to yours:

  1. Memory requirement and computation time are independent of the external event rate. They are determined by the local timer rate, which you control.
  2. The queue size depends on how much averaging you want to do. A queue size of 1 (i.e. no queue) with a timer rate of once per second results in a raw events-per-second reading, with no averaging. The larger the queue, the more averaging you get.
  3. The response time is determined by the amount of averaging desired. More averaging results in slower response times.
  4. The frequency is updated at the rate that the local timer fires, regardless of whether external events occur.
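A minimal sketch of this scheme (names are hypothetical), with the timer tick driven explicitly so the behaviour is easy to test:

```python
from collections import deque

class TimerRateCounter:
    """Counts events between timer ticks; averages the last `window` ticks."""

    def __init__(self, window=5, tick_seconds=1.0):
        self.tick_seconds = tick_seconds
        self.count = 0                       # events since the last tick
        self.buckets = deque(maxlen=window)  # per-tick event counts

    def on_event(self):
        """Called by the external event; O(1), no allocation."""
        self.count += 1

    def on_tick(self):
        """Called by the periodic local timer.

        Appends the current count (the deque discards the oldest bucket
        automatically), resets the counter, and returns the averaged
        rate in events per second."""
        self.buckets.append(self.count)
        self.count = 0
        return sum(self.buckets) / (len(self.buckets) * self.tick_seconds)
```

With `window=1` this is a raw events-per-second reading; larger windows trade response time for smoothing, exactly as in point 2 above.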
user3386109

I've implemented something similar to calculate the rate/concentration of particles in an air stream. The particles arrive at random times (probably Poisson distributed), and I want to know the average rate in particles per unit time. My approach is as follows (taken from the docstring of my code):

Event timestamps are placed in a buffer of a fixed maximum size. Whenever a concentration estimate is requested, all timestamps older than a set maximum age are dropped. If the remaining number of recent timestamps is below some threshold, the reported concentration is zero. Otherwise, the concentration is calculated as the number of remaining events divided by the time difference between the oldest remaining timestamp and now.
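That recipe can be condensed into a small self-contained sketch (names and thresholds are illustrative, a plain list stands in for the ring buffer used in the real code):

```python
import time

def estimate_rate(timestamps, max_age=5.0, min_count=3, now=None):
    """Estimate the event rate in events/s from a list of timestamps.

    Drops timestamps older than `max_age` seconds, reports zero when
    fewer than `min_count` recent events remain, otherwise divides the
    number of remaining events by the span from the oldest remaining
    timestamp to now."""
    if now is None:
        now = time.time()
    recent = [ts for ts in timestamps if now - ts <= max_age]
    if len(recent) < min_count:
        return 0.0
    span = now - min(recent)
    if span < 1e-3:
        # All events share (nearly) one timestamp; avoid dividing by ~0.
        # (The full implementation below keeps the previous value instead.)
        return 0.0
    return len(recent) / span
```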

I'll attach the Python implementation below for reference. It is a part of a much larger project, so I had to slightly modify it to get rid of references to external code:

particle_concentration_estimator_params.py

#!/usr/bin/env python3

"""This module implements a definition of parameters for the particle
concentration estimator.
"""

__author__    = "bup"
__email__     = "bup@swisens.ch"
__copyright__ = "Copyright 2021, Swisens AG"
__license__   = "GNU General Public License Version 3 (GPLv3)"

__all__ = ['ParticleConcentrationEstimatorParams']


import dataclasses


@dataclasses.dataclass
class ParticleConcentrationEstimatorParams:
    """This provides storage for the parameters used for the particle
    concentration estimator.
    """

    timestamp_buffer_max_size: int
    """Maximum size of the buffer that is used to keep track of event
    timestamps. The size of this buffer mainly affects the filtering
    of the reported data.

    Unit: - (count)
    """

    timestamp_buffer_max_age: float
    """Maximum age of events in the timestamp buffer which are
    considered for the concentration calculation. This value is a
    tradeoff between a smooth filtered value and the dynamic response
    to a changed concentration.

    Unit: s
    """

    min_number_of_timestamps: int
    """Minimum number of timestamps to use for the concentration
    estimation. If fewer timestamps are available, the concentration
    is reported as zero.

    Unit: - (count)
    """

particle_concentration_estimator.py

#!/usr/bin/env python3

"""This module implements the particle concentration estimation.
"""

__author__    = "bup"
__email__     = "bup@swisens.ch"
__copyright__ = "Copyright 2021, Swisens AG"
__license__   = "GNU General Public License Version 3 (GPLv3)"

__all__ = ['ParticleConcentrationEstimator']


import logging
import time
from typing import Optional

import numpy as np

from .particle_concentration_estimator_params import ParticleConcentrationEstimatorParams


logger = logging.getLogger(__name__)


class ParticleConcentrationEstimator:
    """An object of this class implements the Poleno particle
    concentration estimator. Particle concentration is basically just
    a number of particles per time unit. But since the particle events
    arrive irregularly, there are various ways to filter the result, to
    avoid too much noise especially when the concentration is low. This
    class implements the following approach:

    Event timestamps are placed in a buffer of a fixed maximum size.
    Whenever a concentration estimate is requested, all timestamps up to
    a set maximum age are filtered out. If the remaining number of
    recent timestamps is below some threshold, the reported
    concentration is zero. Otherwise, the concentration is calculated
    as the number of remaining events divided by the time difference
    between the oldest timestamp in the buffer and now.
    """

    def __init__(self, params: ParticleConcentrationEstimatorParams):
        """Initializes the object with no events.

        Args:
            params: Initialized ParticleConcentrationEstimatorParams
                object describing the estimator's behaviour.
        """
        self.params = params
        """Stored params for the object."""
        n_rows = self.params.timestamp_buffer_max_size
        self._rb = np.full((n_rows, 2), -1e12)
        self._rb_wp = 0
        self._num_timestamps = 0
        self._concentration_value = 0.0
        self._concentration_value_no_mult = 0.0

    def tick(self, now: float) -> float:
        """Recalculates the current concentration value.
        
        Args:
            now: Current timestamp to use to filter out old entries
                in the buffer.

        Returns:
            The updated concentration value, which is also returned
            using the concentration attribute.
        """
        min_ts = now - self.params.timestamp_buffer_max_age
        min_num = self.params.min_number_of_timestamps
        used_rows = self._rb[:, 0] >= min_ts
        filt_ts = self._rb[used_rows]
        num_ts = round(np.sum(filt_ts[:, 1]))
        self._num_timestamps = num_ts
        num_ts_no_mult = round(np.sum(used_rows))
        if num_ts < min_num:
            self._concentration_value = 0.0
            self._concentration_value_no_mult = 0.0
        else:
            t_diff = now - np.min(filt_ts[:, 0])
            if t_diff >= 1e-3:
                # Do not change the reported value if all events in the
                # buffer have the same timestamp.
                self._concentration_value = num_ts / t_diff
                self._concentration_value_no_mult = num_ts_no_mult / t_diff
        return self._concentration_value

    def got_timestamp(self,
                      ts: Optional[float] = None,
                      multiplier: float = 1.0) -> None:
        """Passes in the most recent event timestamp. Timestamps need
        not be ordered.

        Calling this method does not immediately update the
        concentration value, this is deferred to the tick() method.

        Args:
            ts: Event timestamp to use. If None, the current time is
                used.
            multiplier: Optional multiplier, which causes ts to be
                counted that many times.
        """
        if ts is None:
            ts = time.time()
        self._rb[self._rb_wp] = (ts, float(multiplier))
        self._rb_wp = (self._rb_wp + 1) % self._rb.shape[0]
        self._num_timestamps += round(multiplier)

    @property
    def concentration(self) -> float:
        """The calculated concentration value.

        Unit: 1/s
        """
        return self._concentration_value

    @property
    def concentration_no_mult(self) -> float:
        """The calculated concentration value without taking into
        account the timestamp multipliers, i.e. as if all timestamps
        were given with the default multiplier of 1.

        Unit: 1/s
        """
        return self._concentration_value_no_mult

    @property
    def num_timestamps(self) -> int:
        """Returns the number of timestamps which currently are in
        the internal buffer.
        """
        return self._num_timestamps
Philipp Burch