analyzing multiple dependent variables (different species same predictors) in a gam model using dlply

Question

I am still on the R learning curve (and new to this forum) and have been trying to loop a GAM to evaluate different responses (dependent variables, in this case fish species) against the same environmental predictors. Basically, I want to recreate the following in a loop:

g1<-gam(var1~s(x1)+s(x2))

g2<-gam(var2~s(x1)+s(x2))

g3<-gam(var3~s(x1)+s(x2))

My data frame consists of the abundance of each species in separate columns, followed by the environmental predictors in separate columns. I followed some suggestions on this forum, using melt and dlply to iterate the GAM by species. I used the code below:

melt.x<-melt(FullMatrix, id=c("PLAND_50","FragIndex_50"),measure.var=c("CALLI_SPP","CARIDEA",  "EUC_ARG","FAR_DUO","HAE_SCI","LAG_RHO","LUC_PAR","LUT_GRI","OPS_BET","SPH_BAR","SYN_SPP"), variable.name="Nekton", value.name="Abundance")

The above code created a data.frame with one row for each species observation.

attach(melt.x)

gams_50<-dlply(melt.x, .var=c("Nekton"), .fun=function(x){
gam(scale(Abundance)~s(PLAND_50,bs="cr")+s(FragIndex_50,bs="cr")+te(PLAND_50,FragIndex_50), gamma=1.4)
})

lapply(gams_50, summary)

The code ran without errors. However, when I inspected the output I noticed that the results were the same across species (i.e., same adjusted R-squared, p-values for the variables, GCV score, etc.). Does anybody have ideas on how to run multiple GAMs, or how to identify the source of the error?

All ideas are welcome and appreciated! Hopefully I was not vague. Please let me know if you need any further information.

Best, Rolo

score 0 · Answer 1 · answered May 07 '14 at 16:25

I am not sure why it is not working, but here are two possible solutions using a similar data frame that I created. There are probably more elegant solutions, but these work:

library(plyr)
library(reshape)  # provides melt.data.frame() with the variable_name argument used below
library(mgcv)

## Creation of an artificial data frame

sp_a  <- rnorm (100, mean = 3, sd = 0.9)

sp_b  <- rnorm (100, mean = 7, sd = 2.3)


var_env_1 <- rnorm (100, mean = 1, sd = 0.3)

var_env_2 <- rnorm (100, mean = 5, sd = 1.6)


data <- data.frame (sp_a, sp_b, var_env_1, var_env_2  ) 

## solution 1

data_2 <- melt.data.frame (data, measure.vars = c("sp_a", "sp_b"), variable_name = "Species") 

dlply ( data_2, .(Species), function (x) { summary (gam(value ~ s (var_env_1 ) + s (var_env_2), data = x  ) ) }  )
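One plausible explanation for the identical results in the question: the gam() call there has no data argument, and attach(melt.x) is in effect. In that situation Abundance, PLAND_50 and FragIndex_50 are looked up in the attached (full) data frame rather than in the subset x that dlply passes to the function, so every species gets the same fit. The sketch below illustrates the fix with data = d on simulated data; the data frame long, its columns x1 and x2, and the simulated responses are made up for illustration and are not the original data.

```r
library(plyr)
library(mgcv)

set.seed(1)
n <- 100

# Simulated long-format data (hypothetical names), one row per species observation
long <- data.frame(
  Nekton = rep(c("sp_a", "sp_b"), each = n),
  x1 = runif(2 * n),
  x2 = runif(2 * n)
)

# Give the two species different relationships with the predictors
long$Abundance <- ifelse(long$Nekton == "sp_a",
                         sin(2 * pi * long$x1),
                         long$x2^2) + rnorm(2 * n, sd = 0.2)

# One GAM per species; data = d restricts each fit to that species' rows
fits <- dlply(long, .(Nekton), function(d) {
  gam(Abundance ~ s(x1, bs = "cr") + s(x2, bs = "cr"), data = d, gamma = 1.4)
})

# Each model now reports its own adjusted R-squared
sapply(fits, function(m) summary(m)$r.sq)
```

Dropping attach() altogether and always passing data explicitly avoids this kind of silent lookup on the search path.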

## solution 2

## Loop

for (i in names (data)[1:2] ) {

  assign ( paste ("m",i,sep = "_") , gam ( get (i) ~ s (var_env_1 ) + s (var_env_2 ) , data = data))

  print( summary ( get (paste ("m",i,sep = "_") ) ) )

}

# The last print is equivalent to doing this:

summary(m_sp_a)

summary(m_sp_b)
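As a variation on the loop, the models can be collected in a named list instead of being created with assign() and get(). This is only a sketch of one alternative, not a correction of the loop above: the names dat, responses and models are introduced here for illustration, and the formula for each species is built with reformulate() from base R.

```r
library(mgcv)

set.seed(2)

# Artificial data with the same layout as the example above
dat <- data.frame(
  sp_a = rnorm(100, mean = 3, sd = 0.9),
  sp_b = rnorm(100, mean = 7, sd = 2.3),
  var_env_1 = rnorm(100, mean = 1, sd = 0.3),
  var_env_2 = rnorm(100, mean = 5, sd = 1.6)
)

responses <- c("sp_a", "sp_b")

# Fit one GAM per response column and keep the fits in a named list
models <- lapply(setNames(responses, responses), function(resp) {
  # reformulate() builds e.g. sp_a ~ s(var_env_1) + s(var_env_2)
  f <- reformulate(c("s(var_env_1)", "s(var_env_2)"), response = resp)
  gam(f, data = dat)
})

# Summaries for all species at once; a single model is models$sp_a
lapply(models, summary)
```

Keeping the fits in a list makes it easy to apply the same post-processing (summary, plot, predict) to every species with lapply or sapply.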
