I would like to resample a large data set in which the number of observations varies across the range of the data, so that each interval of values ends up with an equal number of observations.
It seems like rollapply would be the way to do this, but it does not appear that its rolling window can be defined in terms of the data values rather than the number of observations. Is that possible?
For example:
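library(zoo)  # provides rollapply
set.seed(12345)
z <- sort(rnorm(100, 100, 40))
# resample 20 values (with replacement) from each consecutive block of 20 observations
rollapply(z, 20, function(x) sample(x, 20, replace = TRUE), by = 20)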
This works well for resampling a vector in consecutive blocks of 20 observations. However, I would like it to start at the lowest value and resample within regular, equal-width bins of values instead. For the above example, the bin edges could be defined as follows (11 edges for 10 bins; the first 10 values are the left edges):
(0:10)*(max(z)-min(z))/10+min(z)
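To make the imbalance concrete, here is a quick check of how many observations land in each of those bins. It is only a sketch: the names `edges` and `bin` are mine, and I am assuming `seq()` with `length.out = 11` as a stand-in for the formula above (the same edges up to floating-point rounding), with `include.lowest = TRUE` so that `min(z)` is counted in the first bin.

```r
set.seed(12345)
z <- sort(rnorm(100, 100, 40))

# 11 edges delimiting 10 equal-width bins spanning the range of z
edges <- seq(min(z), max(z), length.out = 11)

# assign each observation to a bin; include.lowest keeps min(z) in bin 1
bin <- cut(z, breaks = edges, include.lowest = TRUE)

# number of observations per bin (far from equal for normal data)
table(bin)
```

The counts are what I would like to even out by resampling within each bin.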
I know I could write a for loop and do this, but I am looking for a faster / simpler method.
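For reference, the loop version I have in mind looks roughly like the sketch below. The names `edges`, `bin`, `n_draws`, `out` and `resampled` are just for illustration, and I am assuming 20 draws per bin (to match the rollapply call above) and that empty bins are simply skipped.

```r
set.seed(12345)
z <- sort(rnorm(100, 100, 40))

edges   <- seq(min(z), max(z), length.out = 11)  # 10 equal-width bins
n_draws <- 20                                    # assumed draws per bin

# integer bin index (1..10) for every observation
bin <- cut(z, breaks = edges, include.lowest = TRUE, labels = FALSE)

out <- vector("list", 10)
for (i in seq_len(10)) {
  x <- z[bin == i]
  if (length(x) > 0) {
    # index with sample.int(): sample(x, ...) on a single number
    # would sample from 1:x instead of returning x itself
    out[[i]] <- x[sample.int(length(x), n_draws, replace = TRUE)]
  } else {
    out[[i]] <- numeric(0)  # nothing to resample in an empty bin
  }
}
resampled <- unlist(out)
```

This gives the same number of values for every non-empty bin, but it is the explicit loop I would like to avoid.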
As a smaller example, take an input vector with an unequal distribution of observations between the ranges 1:10 and 11:20:
c(1, 2, 2, 3, 3, 3, 5, 6, 7, 11, 13, 13, 20)
Resampling 5 times within each of the 2 intervals of 10 units (i.e. 1:10 and 11:20) could produce:
c( 3, 1, 7, 3, 2, 11,20,11,13,20)