Trying to run a number of simple functions by creating a window of data in data table but struggling with mixing a fixed number of sequenced observations and a shifted date - not to be confused with any number of observations over a rolling date range, which most questions seem concerned with.
We need the last 30 observations grouped by name, before the day of the current observations, which may occur over varying numbers of days.
So Jimmy, John and Susan have all been practising tiddlywinks for the last year.
We want to know for each of them how many they have successfully flipped into the pot over their last 30 attempts (or shots), and for example the rolling average over the last 30 attempts. However, these 30 attempts should all be before the current date on which they are flipping, so we want to exclude any shots made today.
Thus a data table called Tiddlywinks could be a record of each shot, observing:
Player (the player's name) DateTime (the time and date of each flip, so ordering gives us the sequence of flips) Date (the date of each flip, so we can lag all observations by current date) Success (1 or 0, with 1 being a successful flip into the pot)
I can create rolling counts and therefore averages by date, no problem, and have found the R runner package useful in this regard, but it only seems to enable rolling date windows or sequences but not both mixed. Eg. if wanting to count how many successes each player had in the last 30 days before the current date, we could say:
library(data.table)
library(runner)
dataset[order(DateTime), rolling_wins_30 := sum_run(
x = Success,
k = 30,
idx = as.Date(Date),
lag=1
),
by=Player][is.na(rolling_wins_30), rolling_wins_30 := 0]
But we don't want to do this. We want to know (for example) how many successes a player had in their last 30 attempts (so a rolling window of 30 observations in strict sequence) before any attempts on the current date (so excluding the current date). All players vary in the amount of practice they do, so a rolling window based on flips (ie. observations, not dates) is needed, but it still needs to be in time order, and still needs to exclude all flips on the current day.
This would be useful if we wanted to evaluate the current form of all players turning up for a competition today, but based on their last 30 flips before today, for example.
Can't find anything that addresses the problem of a fixed rolling window of observations, all before a certain date.
Ideas?