Given a series of events, is there an algorithm for determining whether a certain number of events occur within a certain period of time? For example, given a list of user logins, are there any thirty-day periods that contain more than 10 logins?

I can come up with a few brute-force ways to do this; I'm just wondering if there is an algorithm or a name for this kind of problem that I haven't turned up with the usual Google searching.

frugardc

1 Answer

In general this is called binning: aggregating one variable (e.g. events) over an index (e.g. time), using a count as the summary function.

Since you didn't provide data I'll just show a simple example:

# Start with a dataframe of dates and number of events
data <- data.frame(date=paste('2013', rep(1:12, each=20), rep(1:20, times=12), sep='-'),
                   logins=rpois(12*20, 5))

# Make sure to store dates as class Date; it can be useful for other purposes
data$date <- as.Date(data$date)

# Now bin it. Exactly how you do it depends on what you want.
# Let's just sum the number of events for each calendar month
data$month <- format(data$date, '%m')
aggregate(logins~month, data=data, sum, na.rm=TRUE)
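
Calendar months are fixed bins, so a burst of logins that straddles two months would be split between them. If you really need "any thirty-day period", a sliding window is one option. Here is a minimal brute-force sketch in base R, not a definitive implementation; the helper name `window_totals`, the column names and the sample data are all made up for illustration:

```r
# Hypothetical helper: for each event date, total the counts that fall in
# the window of `window_days` days starting on that date.
window_totals <- function(dates, counts, window_days = 30) {
  sapply(seq_along(dates), function(i) {
    in_window <- dates >= dates[i] & dates < dates[i] + window_days
    sum(counts[in_window])
  })
}

# Made-up data: 6 logins on each of two January days, 2 logins in March
logins <- data.frame(
  date   = as.Date(c('2013-01-01', '2013-01-05', '2013-03-10')),
  logins = c(6, 6, 2)
)

# Total for the 30-day window starting at each date
logins$total <- window_totals(logins$date, logins$logins)

# Window start dates with more than 10 logins
logins[logins$total > 10, ]
```

It is enough to test windows that start on an event date, because any window over the threshold can be slid forward to its first event without losing any events. This compares every date against every other date, so it is quadratic in the number of rows; for large data you would want something smarter.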

Is that what you wanted?

Oscar de León