I'm trying to get confidence intervals for detection functions fitted with GAMs. So far I've fitted the GAM with the factors and variables:

library(mgcv)
model = gam(Capture>0~s(Distance)+s(Amplitude)+DetectorNumber,
            family=binomial, data=dat)

See the figure here: https://drive.google.com/file/d/0BxC5badRi-zjT2lHQVFSUlAydW8/view?usp=sharing

Now I need the confidence intervals, which, as I understand it, would come from integrating the upper and lower edges of the shaded band. However, even if I could work out how to do that integration, I believe it relies on the assumption that the variance is constant. In this case we would expect more measurement error as the distance increases.

We also need to break the results out by channel and get the detection probability (with CIs) for each detector number. Again, I've had some success using the predicted values but haven't managed to get confidence intervals.

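  # Prediction grid for detector ii; column names must match the model terms
  # (assumes Amplitude is a column of dat)
  sim=data.frame(Distance=seq(from=0, to=39.99, by=.01),
                 Amplitude=sample(dat$Amplitude,4000),
                 DetectorNumber=rep(ii,4000))

  sim$DetectorNumber=factor(sim$DetectorNumber)
  yy=predict(model, newdata=sim, type='response')
  # Riemann sum over distance divided by the distance range, i.e. mean(yy)
  pdet=sum(yy*.01)/(4000*.01)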
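For the pointwise intervals around the predicted curve, one approach that might work is to ask `predict` for standard errors on the link (logit) scale and back-transform, so the limits stay inside [0, 1]. This is only a sketch: the data below are simulated stand-ins (the coefficients, the three detector levels and the median-Amplitude choice are all assumptions for illustration, not my real data), and it gives intervals for the curve, not for the integrated detection probability.

```r
library(mgcv)

# Simulated stand-in for the real data (hypothetical values, illustration only)
set.seed(1)
n <- 2000
dat <- data.frame(
  Distance       = runif(n, 0, 40),
  Amplitude      = rnorm(n, 100, 10),
  DetectorNumber = factor(sample(1:3, n, replace = TRUE))
)
eta <- 3 - 0.25 * dat$Distance + 0.02 * (dat$Amplitude - 100) +
  0.3 * (as.integer(dat$DetectorNumber) - 2)
dat$Capture <- rbinom(n, 1, plogis(eta))

model <- gam(Capture > 0 ~ s(Distance) + s(Amplitude) + DetectorNumber,
             family = binomial, data = dat)

# Prediction grid for one detector, with Amplitude held at its median
dist_grid <- seq(0, 40, length.out = 400)
sim <- data.frame(
  Distance       = dist_grid,
  Amplitude      = median(dat$Amplitude),
  DetectorNumber = factor("1", levels = levels(dat$DetectorNumber))
)

# Standard errors on the link scale, then back-transform to probabilities
pr <- predict(model, newdata = sim, type = "link", se.fit = TRUE)
sim$fit   <- plogis(pr$fit)
sim$lower <- plogis(pr$fit - 1.96 * pr$se.fit)
sim$upper <- plogis(pr$fit + 1.96 * pr$se.fit)
```

The 1.96 multiplier gives approximate 95% pointwise limits. The standard error varies along the grid, so nothing here assumes constant variance.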

I'm starting to lean towards bootstrapping to get the detection probability and CI values, but I'm not sure how to approach that using a GAM.
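A rough sketch of what I have in mind is a nonparametric (case-resampling) percentile bootstrap: resample rows, refit the GAM, recompute the averaged detection probability, and take quantiles. The data are simulated stand-ins, `fit_pdet` is a hypothetical helper name, and treating rows as independent is an assumption that may not hold for my detections.

```r
library(mgcv)

# Simulated stand-in for the real data (hypothetical values, illustration only)
set.seed(42)
n <- 500
dat <- data.frame(
  Distance       = runif(n, 0, 40),
  Amplitude      = rnorm(n, 100, 10),
  DetectorNumber = factor(sample(1:3, n, replace = TRUE))
)
eta <- 3 - 0.25 * dat$Distance + 0.02 * (dat$Amplitude - 100)
dat$Capture <- rbinom(n, 1, plogis(eta))

# Hypothetical helper: fit the GAM to d and return the detection probability
# averaged over a distance grid for one detector
fit_pdet <- function(d, detector) {
  m <- gam(Capture > 0 ~ s(Distance) + s(Amplitude) + DetectorNumber,
           family = binomial, data = d)
  dist_grid <- seq(0, 40, length.out = 400)
  grid <- data.frame(
    Distance       = dist_grid,
    Amplitude      = sample(d$Amplitude, length(dist_grid), replace = TRUE),
    DetectorNumber = factor(detector, levels = levels(d$DetectorNumber))
  )
  mean(predict(m, newdata = grid, type = "response"))
}

pdet_hat <- fit_pdet(dat, "1")

# Resample rows with replacement, refit, and recompute pdet each time
B <- 50  # kept small so the sketch runs quickly; far more in practice
boot <- replicate(B, fit_pdet(dat[sample(nrow(dat), replace = TRUE), ], "1"))

# Percentile interval for the averaged detection probability
ci <- quantile(boot, c(0.025, 0.975))
```

Refitting on each resample lets the smoothing-parameter uncertainty feed into the interval, at the cost of one model fit per replicate.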

Your thoughts would be most appreciated.

Incidentally, if anyone knows how to force the GAM towards 0 in the tail, that would be helpful as well (e.g. the probability of detecting a bird at 40 km is 0).

K.J. Palmer
  • I get the sense that your first stop should not be a coding forum, but rather a place where methodologists hang out. The two such places that come to mind are CrossValidated.com and the R-SIG-geo mailing list. I also found the word "integration" being used in what I thought was a very non-mathematical sense, so I would suggest clarifying exactly what was meant by that term. – IRTFM Mar 23 '15 at 21:59
  • I don't know what the kids are calling it, whatever 'it' is. (I'm eligible for Medicare.) You talk about "shaded areas" but that must be some sort of graphic output and not a mathematical entity. Are you saying the area or volume between prediction limits has some sort of meaning in your domain of science? Seems strange, so please feel free to put a hyperlink in an edit to your question showing this process being performed and interpreted. (And the word is "sites".) – IRTFM Mar 23 '15 at 22:13
  • In thinking a bit more about what you might be imagining it occurred to me that you were talking about some sort of kernel density estimates, possibly multidimensional ones. Again I say .... get thee to a statistician. – IRTFM Mar 23 '15 at 22:36
