I would like to have an efficient way of calculating the (approximate) count of a recurring event over a given time frame.
Example: I am trying to repeatedly download a file from a host. It usually works fine, but sometimes an error happens when the network is congested. I don't care about these single errors. Every once in a while though, the host is offline, so all my attempts fail. In that case I would like to automatically stop my program from trying again.
So I need to find out how many errors occured over the last x minutes. When the number is below a certain threshold, nothing happens. When it is above, I want to take an action. The count does not have to be 100% accurate, only accurate enough to tell me whether the threshold was reached.
A simple, yet ineffective (O(n)
), way of doing this would be to just store the timestamps of the events, and then for every new event determine the number of previous events by iterating over them and comparing the timestamps (up until the time frame is reached). [aside] I imagine this is what SQL engines do for a WHERE timestamp BETWEEN NOW() AND INTERVAL X MINUTES
, unless they have an index on the column. [/aside]
I want a solution with a constant (O(1)
) complexity. So far I am thinking that I will keep a counter of the event that increases by 1 with every event. I will also store the timestamp of the most recent occurance. Then, when a new event happens, by some math magic I can decrease the counter using the current time and the stored timestamp to reflect approximately how many events happened over the last x minutes.
Unfortunately my math skills are not up to the task. Can someone provide some hints?