Let's say we maintain a website on which every request is recorded. How can we determine, at any point in time, the number of requests made in the last 5 minutes?
I could get to a solution for 5 minutes, but I am not sure how to make it generic for any time interval.
My approach:
We maintain an array of size 300 (one slot per second) and a pointer to the current index, which advances by one every second. Whenever the count is requested, we just return the value the pointer refers to.
To populate the array the first time around, all values are cumulative. For instance, if the number of requests made in the 1st second is 3, in the 2nd second 5, in the 3rd second 0, and so on, then the array looks like
3, 8, 8, 0, ..., 0, where the pointer points to index 2.
(Let's fast forward to 4:59, when the contents of the array are)
3, 8, 8,....,180, 0
where the pointer refers to index 298, as we haven't populated index 299 yet.
Now let's say the numbers of requests recorded in the next 2 seconds are 5 and 2. The array then looks like:
3, 8, 8, ............, 180, 185 (updated at 5:00)
(185 + 2 - 3 (old value)), 8, 8, ............, 180, 185 => 184, 8, 8, ............, 180, 185 (updated at 5:01)
The pointer now refers to index 0, so as of now the number of requests made in the last 5 minutes is 184.
On similar lines, we should be able to return the value at any point in time, in O(1).
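A minimal Python sketch of the fixed 5-minute window, to make the bookkeeping concrete. The class and method names are hypothetical, and timestamps are assumed to be integer seconds that never go backwards. It deliberately stores per-second counts plus a running total rather than the cumulative values described above. This is a swapped-in variation: subtracting a stored cumulative value works on the first wrap (185 + 2 - 3), but on later slots it would appear to remove earlier seconds twice.

```python
class SlidingWindowCounter:
    """Counts requests in the last `window` seconds with a ring buffer.

    Hypothetical helper for illustration; assumes integer timestamps
    (in seconds) that are non-decreasing across calls.
    """

    def __init__(self, window=300):
        self.window = window
        self.counts = [0] * window  # per-second counts, one slot per second
        self.total = 0              # sum of all slots = requests in the window
        self.now = 0                # latest second seen so far

    def _advance(self, t):
        # Expire every slot that is about to be reused by seconds now+1..t.
        steps = min(t - self.now, self.window)
        for i in range(1, steps + 1):
            idx = (self.now + i) % self.window
            self.total -= self.counts[idx]
            self.counts[idx] = 0
        self.now = max(self.now, t)

    def record(self, t, n=1):
        """Record n requests that arrived during second t."""
        self._advance(t)
        self.counts[t % self.window] += n
        self.total += n

    def query(self, t):
        """Return the number of requests in seconds t-window+1 .. t."""
        self._advance(t)
        return self.total


counter = SlidingWindowCounter(300)
counter.record(1, 3)      # 3 requests in the 1st second
counter.record(2, 5)      # 5 requests in the 2nd second
counter.record(100, 172)  # filler so the total at 4:59 is 180
counter.record(300, 5)    # total is now 185
counter.record(301, 2)    # the 1st second (3 requests) drops out
print(counter.query(301))
```

When the clock moves one second at a time, each call does a constant amount of work; after an idle gap, `_advance` clears at most `window` slots.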
But how do I make the solution generic? That is, what if the time period is arbitrary: the number of requests in the last 10 minutes, the last 20 minutes, or the last 1 minute? I thought we could leverage segment trees, but we would end up modifying all the values every second, which would be too costly. A MapReduce program triggered whenever getRequestsinLastNMins() is called would be an O(N) solution. But I am looking for something that can be done in O(1).
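For reference, one possible direction for the arbitrary-window case, sketched under an assumption that is not stated above: that there is a known upper bound on the window length (`max_window` below). The idea is to keep true running totals in a ring buffer, so that the count for the last N seconds is the difference of two stored totals. All names here are hypothetical, and timestamps are again assumed to be non-decreasing integer seconds.

```python
class PrefixWindowCounter:
    """Ring buffer of running totals; answers any window up to max_window.

    Hypothetical sketch: cum[t % size] holds the total number of requests
    recorded through second t, for the most recent `size` seconds.
    """

    def __init__(self, max_window=3600):
        self.size = max_window + 1
        self.cum = [0] * self.size
        self.now = 0  # latest second seen so far

    def _advance(self, t):
        if t <= self.now:
            return
        # Carry the current running total forward into each new second.
        last = self.cum[self.now % self.size]
        start = max(self.now + 1, t - self.size + 1)
        for s in range(start, t + 1):
            self.cum[s % self.size] = last
        self.now = t

    def record(self, t, n=1):
        """Record n requests that arrived during second t."""
        self._advance(t)
        self.cum[t % self.size] += n

    def query(self, t, n_seconds):
        """Return the number of requests in the last n_seconds ending at t."""
        if not 0 < n_seconds <= self.size - 1:
            raise ValueError("window must be between 1 and max_window seconds")
        self._advance(t)
        newest = self.cum[t % self.size]
        start = t - n_seconds
        oldest = self.cum[start % self.size] if start >= 0 else 0
        return newest - oldest


p = PrefixWindowCounter(max_window=600)
for second, n in [(1, 3), (2, 5), (100, 172), (300, 5), (301, 2)]:
    p.record(second, n)
print(p.query(301, 300), p.query(301, 60), p.query(301, 1))
```

Each query is two array reads and a subtraction, at the cost of O(max_window) memory. Python integers do not overflow, so the running totals can grow without wrapping; in a fixed-width language that would need separate handling.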