I need to run an analysis in which a GBM algorithm is applied to a series of bootstrapped replicates. Another wrinkle is that each replicate needs to have a quantile-normalized outcome. What I am eventually trying to achieve is:

1. Start with the main data set
2. Create a 3-dimensional array that contains 200 resamples
3. Quantile normalize the outcome variable within each resample
4. Run a GBM on every resample

Right now, I can't even get to the resampling step.

#generating some data    
main<-matrix(
  replicate(52,rnorm(1132)),
  ncol=52,
  nrow=1132,
  dimnames = list(
    1:1132,
    1:52)
)

colnames(main)[1]<-"outcome"

#trying to create 200 resampled replicates
resampled = array (
  rep(NA),
  dim= c(1000, ncol(main), 200),
  dimnames= list(
      1:1000,
      colnames(main),
      1:200
      )
   ) 


  for (i in 1:dim(resampled)[1]) {
    for (j in 1:dim(resampled)[2]) {
      for (k in 1:dim(resampled)[3]) {
        resampled[i,j,k]= main[sample(nrow(main), size=1000, replace=TRUE),]

  }
}}

I'm pretty sure it's because I'm not specifying the loop correctly, but after weeks of searching, I can't find exemplar code that will help me.

I currently get an error message: Error in resampled[i, j, k] = main[sample(nrow(main), size = 1000, replace = TRUE), : number of items to replace is not a multiple of replacement length

  • Can you ask a minimal reproducible question with a small dataset and expected results? – shayaa Oct 05 '16 at 23:27
  • When you make a minimal reproducible example, it would help to include your expected outcome and the actual outcome you get. – pdb Oct 06 '16 at 01:12
  • I'm on it. This is all very new to me. Will update later tonight when I figure out how to get a reproducible example. – BobaAddict Oct 06 '16 at 01:48

1 Answer

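The problem in your loop is that resampled[i,j,k] is a single cell and can hold only one value, but main[sample(nrow(main), size=1000, replace=TRUE),] returns a 1000x52 matrix. The loops over i and j are therefore unnecessary: each resample fills a whole slice resampled[,,k], so a single loop over k is enough.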
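To see the mismatch concretely, here is a minimal sketch. The small `demo` matrix, the sizes, and the seed are made up for illustration. It compares what a single cell can hold with what one bootstrap draw produces:

```r
set.seed(1)  # illustrative seed only
demo <- matrix(rnorm(30), nrow = 10, ncol = 3)

# One bootstrap draw: 5 rows sampled with replacement, all 3 columns
draw <- demo[sample(nrow(demo), size = 5, replace = TRUE), ]
dim(draw)     # a 5 x 3 matrix, i.e. 15 values

# A single cell [i, j, k] holds exactly one value ...
arr <- array(NA_real_, dim = c(5, 3, 2))
length(arr[1, 1, 1])

# ... while a whole slice [, , k] has the same shape as the draw,
# so this assignment succeeds
dim(arr[, , 1])
arr[, , 1] <- draw
```

Assigning `draw` to `arr[1, 1, 1]` instead would reproduce the "number of items to replace" error from the question.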

I've made a smaller example based on yours. Try the following code and see if this is what you expect to get as result:

ncol = 3
nrow = 10
sample.size = 5
sample.rep = 4

#generating some data    
main<-matrix(
  replicate(ncol,rnorm(nrow)),
  ncol=ncol,
  nrow=nrow,
  dimnames = list(
    1:nrow,
    1:ncol)
)

colnames(main)[1]<-"outcome"

#trying to create 'sample.rep' resampled replicates
resampled = array (
  rep(as.numeric(NA)),
  dim= c(sample.size, ncol(main), sample.rep),
  dimnames= list(
    1:sample.size,
    colnames(main),
    1:sample.rep
  )
) 

for (k in 1:dim(resampled)[3]) {
  resampled[,,k]= main[sample(nrow(main), size=sample.size, replace=TRUE),]
}
print(resampled)
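
For the next step in the question, normalizing the outcome within each resample, one possible sketch follows. It assumes that "quantile normalized" means a rank-based inverse normal transform of the outcome column, which the question does not confirm. `quantile_normalize` is a hypothetical helper, not a function from any package, and the toy array is regenerated here so the block runs on its own.

```r
set.seed(42)  # illustrative seed only

# Toy data with the same layout as above: 10 rows, 3 columns, first is "outcome"
main <- matrix(rnorm(30), nrow = 10, ncol = 3,
               dimnames = list(1:10, c("outcome", "2", "3")))

# 4 bootstrap resamples of 5 rows each, stored as slices of a 3-D array
resampled <- array(NA_real_, dim = c(5, 3, 4),
                   dimnames = list(1:5, colnames(main), 1:4))
for (k in seq_len(dim(resampled)[3])) {
  rows <- sample(nrow(main), size = 5, replace = TRUE)
  resampled[, , k] <- main[rows, ]
}

# Hypothetical helper (assumption): map ranks to standard normal quantiles.
# Ties, which bootstrap samples usually contain, share their average rank.
quantile_normalize <- function(x) {
  qnorm((rank(x, ties.method = "average") - 0.5) / length(x))
}

# Apply the transform to the outcome column of every resample
for (k in seq_len(dim(resampled)[3])) {
  resampled[, "outcome", k] <- quantile_normalize(resampled[, "outcome", k])
}
```

Each slice `resampled[, , k]` can then be converted with `as.data.frame()` and handed to whatever model-fitting call is used for the GBM step.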
Gabriel Mota
  • It is now giving me a message about not having the right number of subscripts, and my array has turned into one absurdly large object. I tried for a bit to find a way to split the giant object back into an array. I'm going to try a different approach, since so many more steps have to happen after creating the array. Thanks for your help! – BobaAddict Oct 09 '16 at 06:26