I have a vertically arranged (stacked) pooled time series data.frame
that looks like this:
date item qty_sold
day_1 orange 0
day_2 orange 0
day_3 orange 0
day_4 orange 0
day_5 orange 5
day_6 orange 0
day_7 orange 8
day_8 orange 0
day_1 hammer 0
day_2 hammer 0
day_3 hammer 3
day_4 hammer 0
day_5 hammer 70
day_6 hammer 70
day_7 hammer 0
Day_8 hammer 80
In each "item's" sub-series/sub-group, I need to identify and remove *all observations prior to the day on which the first positive qty_sold was observed*. For example, for the "orange" series, this means striking out days 1 through 4 and for the "hammer" series this means striking out the first 2 days.
(In case the explanation above is not clear): From each sub-series in the dataset, I need to to remove the all the days from date = Day_1 to date = Day_k, such that for each day in the interval 1...k qty_sold = 0, AND retain all rows where date = Day_k+1 qty_sold >= 0)
Can anyone kindly give an idea on how to go about this? The actual dataset contains about a million rows. I would also welcome suggestions in accomplishing this using SAS apart from R.