I have panel data, i.e. t rows for each of n observations (n × t rows in total), such as
data("Grunfeld", package="plm")
head(Grunfeld)
firm year inv value capital
1 1935 317.6 3078.5 2.8
1 1936 391.8 4661.7 52.6
1 1937 410.6 5387.1 156.9
2 1935 257.7 2792.2 209.2
2 1936 330.8 4313.2 203.4
2 1937 461.2 4643.9 207.2
I want to do a block bootstrap, i.e. I want to resample with replacement, drawing a firm [i] together with all the years in which it is observed. For instance, if year=1935:1937 and firm 1 is randomly drawn, firm [1] should appear in the new sample 3 times, once for each year in 1935:1937. If it is drawn again, it must again appear 3 times. Furthermore, I need to apply my own function to each new bootstrapped sample, and I need to do this 500 times.
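The resampling step described above could look roughly like the following. This is only a minimal sketch under stated assumptions: `resample_firms` is a hypothetical helper name, not a function from any package, and the small `panel` data frame is rebuilt from the six rows shown above so that the example runs without loading `plm`.

```r
# Toy panel rebuilt from the rows shown in the question (2 firms x 3 years).
panel <- data.frame(
  firm    = rep(1:2, each = 3),
  year    = rep(1935:1937, times = 2),
  inv     = c(317.6, 391.8, 410.6, 257.7, 330.8, 461.2),
  value   = c(3078.5, 4661.7, 5387.1, 2792.2, 4313.2, 4643.9),
  capital = c(2.8, 52.6, 156.9, 209.2, 203.4, 207.2)
)

# Hypothetical helper: draw whole firms with replacement and stack their rows.
resample_firms <- function(data, id = "firm") {
  # Row indices belonging to each firm, computed once.
  rows_by_firm <- split(seq_len(nrow(data)), data[[id]])
  # Draw as many firms as the data contains, with replacement.
  drawn <- sample(names(rows_by_firm), size = length(rows_by_firm), replace = TRUE)
  # Each drawn firm contributes all of its rows, once per draw.
  out <- data[unlist(rows_by_firm[drawn], use.names = FALSE), , drop = FALSE]
  rownames(out) <- NULL
  out
}

set.seed(1)
resampled <- resample_firms(panel)
```

In a balanced panel the result has the same number of rows as the original, because every firm contributes the same number of years. In an unbalanced panel the row count can differ from draw to draw.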
My current code is something like this:
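library(boot)
# statistic: with sim = "parametric", boot() passes each generated data set as the first argument
boot.fun <- function(data) {
  est.boot = myfunction(y=data$v1, x=data$v2, other parameters)
  return(est.boot)
}
# ran.gen: should return a data set resampled by whole firms (the ?? is what I am asking about)
boot.sim <- function(data, mle) {
  data = sample(data, ?? ) #
  return(data)
}
start.time = Sys.time()
result.boot <- boot(Grunfeld, boot.fun, R=500, sim = "parametric",
                    ran.gen = boot.sim)
Sys.time() - start.time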
I was thinking of resampling by correctly specifying data = sample(data, ?? ), since that would be smooth and clean, using the column firm as the index. How could I do that? Is there a more efficient alternative?
EDIT.
I do not necessarily need a new boot function. I just need (possibly fast) code that resamples with replacement; I will then pass it to boot as ran.gen=code.which.works.
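To make the intended wiring concrete, here is a hedged sketch of how such a generator might be passed to `boot()`. The body of `boot.sim` is one possible way to resample whole firms, not a confirmed solution, and `boot.fun` here is only a stand-in statistic (the coefficients of `lm(inv ~ value)`) used in place of my own function. The toy `panel` data frame is an assumption built from the rows shown above.

```r
library(boot)

# Toy panel rebuilt from the rows shown in the question (2 firms x 3 years).
panel <- data.frame(
  firm    = rep(1:2, each = 3),
  year    = rep(1935:1937, times = 2),
  inv     = c(317.6, 391.8, 410.6, 257.7, 330.8, 461.2),
  value   = c(3078.5, 4661.7, 5387.1, 2792.2, 4313.2, 4643.9),
  capital = c(2.8, 52.6, 156.9, 209.2, 203.4, 207.2)
)

# ran.gen: boot() calls this as ran.gen(data, mle); mle is unused here.
boot.sim <- function(data, mle) {
  # Row indices of each firm.
  rows_by_firm <- split(seq_len(nrow(data)), data$firm)
  # Draw firms with replacement, then keep every row of each drawn firm.
  drawn <- sample(names(rows_by_firm), length(rows_by_firm), replace = TRUE)
  data[unlist(rows_by_firm[drawn], use.names = FALSE), , drop = FALSE]
}

# Stand-in statistic (an assumption): replace with the real estimator.
boot.fun <- function(data) {
  coef(lm(inv ~ value, data = data))
}

set.seed(1)
result.boot <- boot(panel, boot.fun, R = 500, sim = "parametric",
                    ran.gen = boot.sim)
```

With `sim = "parametric"`, `boot()` evaluates the statistic once on the original data and once on each data set returned by `ran.gen`, so `result.boot$t` holds one row per replicate.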
The output should be a sample of the same dimension as the original, even though firms can be randomly picked twice or more (or not picked at all). For instance, the result could be
head(GrunfeldResampled)
firm year inv value capital
2 1935 257.7 2792.2 209.2
2 1936 330.8 4313.2 203.4
2 1937 461.2 4643.9 207.2
1 1935 317.6 3078.5 2.8
1 1936 391.8 4661.7 52.6
1 1937 410.6 5387.1 156.9
2 1935 257.7 2792.2 209.2
2 1936 330.8 4313.2 203.4
2 1937 461.2 4643.9 207.2
9 1935 317.6 3078.5 122.8
9 1936 391.8 4661.7 342.6
9 1937 410.6 5387.1 156.9
Basically, I need each firm treated as a block, and therefore the resampling should apply to the whole block. I hope this clarifies things.