
I have a loop containing two nested for loops, which I wrote to fill a list of lists of matrices. The lists are defined below (six lists, each containing 1000 matrices). I have seven response variables (ncol = 7), and for each of them I compute six quantities (h2, cvm, cve, mean, wfv, bfv). I have to repeat these calculations 1000 times, with the data frame randomly resampled each time, hence the 1000 matrices. The loop works: I have tested it with a list of 100 matrices. With 1000 matrices, however, it ran for more than a day and a half before I cancelled it because I couldn't wait any longer. Do you have any advice on how I could speed up this loop? A step-by-step description of what happens in each loop is given in the comments next to the code. Many thanks!

 list.h2 <- rep(list(matrix(nrow = length(levels(h2$spec.plot)), ncol = 7)),1000) 
 list.cvm <- rep(list(matrix(nrow = length(levels(h2$spec.plot)), ncol = 7)),1000)
 list.cve <- rep(list(matrix(nrow = length(levels(h2$spec.plot)), ncol = 7)),1000)
 list.mean <- rep(list(matrix(nrow = length(levels(h2$spec.plot)), ncol = 7)),1000)
 list.wfv<-rep(list(matrix(nrow = length(levels(h2$spec.plot)), ncol = 7)),1000)
 list.bfv<-rep(list(matrix(nrow = length(levels(h2$spec.plot)), ncol = 7)),1000)

 ## bind everything into a list of lists
 res.list <- list(list.h2, list.cvm, list.mean, list.cve, list.wfv, list.bfv)
  names(res.list) <- c("h2", "cvm", "mean", "cve", "wfv", "bfv")

 require(plyr)
 library(lme4) 
 for(f in 1:1000){                      # 1000 runs
 dd<- ddply(h, .(spec_name),summarize, ans=sample(spec_sf)) #sample the dataframe 1000 times
 names(dd)<-c("species","m_spec_sf")
 h_dd1<-cbind(h,dd["m_spec_sf"])       #bind the new sampled dataframe to the original dataframe 'h'
 h_dd<-h_dd1[c(1:12,21,14:20)]

 for(j in 1:7){            # 7 is the number of response variables which i want to do the below calculations on

 for(i in 1:length(levels(spec.plot))){  # the levels I want to do the below calculations within

  sub.mat <- h_dd[spec.plot==levels(spec.plot)[i], ]    # create matrix with values for only one spec x plot combination
  means <- tapply(sub.mat[, j+13], as.character(sub.mat$m_spec_sf), mean, na.rm=T)    

  ### means   
  res.list[["mean"]][[f]][i,j] <- mean(sub.mat[,j+13], na.rm=T)

  ### model: use try to stop the whole loop breaking if the model doesn't fit
  m<-try(lmer(h_dd[, j+13]~1+(1|m_spec_sf), subset=spec.plot==levels(spec.plot)[i],data=h_dd))

  if(inherits(m, "try-error")){     ## remove failed models
    res.list[["h2"]][[f]][i,j] <- NA
  } else {
    VCg<-as.numeric(VarCorr(m))     #insert all results from below calculations into the list of matrices
    VCe<-attr(VarCorr(m),"sc")^2
    res.list[["h2"]][[f]][i,j] <- 4*(VCg/(VCg+VCe))  
    res.list[["cve"]][[f]][i,j] <- sqrt(VCe)/mean(means, na.rm=T)  
    res.list[["cvm"]][[f]][i,j] <- sqrt(VCg)/mean(means, na.rm=T)  
    res.list[["bfv"]][[f]][i,j] <- VCg       
    res.list[["wfv"]][[f]][i,j] <- VCe      
  }
}
}
}
j_b
  • First order of business is profiling your code. However, you are apparently trying to fit tens of thousands of `lmer` models, so you shouldn't be surprised that it is slow. – Roland Sep 04 '14 at 13:38
  • @Roland Sorry, what do you mean by profiling the code? Is it simply the lmer that is making it slow? Maybe there is a way to improve the loop to make it faster? – j_b Sep 04 '14 at 13:45
  • Study `help("Rprof")`. It's not the loops that are slow, but what you do inside them. I was just pointing out that `lmer` is relatively slow and not designed to be called that often. There might be some other slow operations that could be improved, but it's easier to find them by profiling than by us trying to understand your non-reproducible code. – Roland Sep 04 '14 at 13:51
  • @Roland will do! thanks for your comments! – j_b Sep 04 '14 at 15:02
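
For reference, the profiling suggested in the comments could look roughly like the sketch below. It uses only base R (`Rprof` and `summaryRprof`). Because the original data are not available, `slow_step()` is a hypothetical stand-in for one pass through the loop body; in practice it would be replaced by a handful of iterations of the real outer loop (say `for(f in 1:5)` instead of `1:1000`).

```r
# Hypothetical stand-in for the body of one outer-loop iteration;
# replace it with the real code (ddply, subsetting, lmer, ...).
slow_step <- function(n = 200000, reps = 20) {
  x <- rnorm(n)
  for (k in seq_len(reps)) sort(x)   # deliberately slow so the profiler records samples
  invisible(NULL)
}

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file)                     # start the sampling profiler
for (f in 1:5) slow_step()           # profile a few runs, not all 1000
Rprof(NULL)                          # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.self)                   # functions ranked by time spent in their own code
```

The `by.self` table shows where the time actually goes. If `lmer` and its internals dominate, restructuring the loops will not help much; if subsetting or `ddply` rank high, those steps are the ones worth rewriting.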

0 Answers