dividing data and fitting in R

Question

I've the following rainfall data for the year 1951:

dat.1951=c(122,122,122,122,122,122,122,122,122,122,122,121,121,121,121,120,119,119,117,117,117,115,115,115,114,112,112,111,110,109,106,105,104,103,102,99,97,95,91,89,88,86,84,83,83,82,82,79,77,77,76,74,74,72,72,71,70,69,66,65,64,61,61,58,56,56,54,53,51,49,48,47,46,46,46,45,42,40,39,38,37,36,36,35,34,33,33,32,30,30,29,28,28,27,25,25,23,22,21,20,20,20,20,20,19,19,18,18,18,16,16,15,15,15,15,15,14,14,14,14,14,14,14,14,14,14,14,13,13,12,12,11,11,11,11,11,11,11,11,11,11,11,11,11,11,10,10,10,9,8,8,8,8,8,8,8,8,8,8,8,8,7,7,6,6,6,6,5,5,5,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,1,1)

I want to fit this data. I break this data into 2 regions (head & tail). One where points are less than 100 (head) and the rest (for > 100) - tail. I could fit an exponential to the head portion (see the code below). For the tail, I want to fit a curve and want to plot both the portions in a single plot along with the data. Can anyone help?

dat.1951<-dat.1951[dat.1951 > 0]
dat.1951.tail<-dat.1951[dat.1951 >= 100]
dat.1951.head<-dat.1951[dat.1951 < 100]
x.head<-seq(1,length(dat.1951.head))
log.data<-log(dat.1951.head)
idf.head<-data.frame(x.head,dat.1951.head)
idf.head$dat.1951.head<-log(idf.head$dat.1951.head)
model=lm(idf.head$dat.1951.head ~ idf.head$x.head,data=idf.head)
summary(model)

plot(dat.1951.head)
lines(idf.head$x.head,exp(fitted(model)),col="blue")

"I want to fit a curve". What does that mean? – Roland Jun 10 '13 at 13:27 — Roland, Jun 10 '13 at 13:27

score 1 · Answer 1 · answered Jun 10 '13 at 15:38

I'm not sure why you want to (1) break the data into two regions, (2) eliminate records where there was no rainfall, and (3) fit the model you describe. You may want to consult with a statistician on these matters.

To answer your question, though, I came up with an example for a second model and show the fits from both models on the same plot.

x <- seq(dat.1951)
sel <- dat.1951 >= 100
model1 <- lm(dat.1951[sel] ~ poly(x[sel], 2))
model2 <- lm(log(dat.1951[!sel]) ~ x[!sel])

plot(dat.1951, cex=1.5)
lines(x[sel], fitted(model1), col="blue", lwd=3)
lines(x[!sel], exp(fitted(model2)), col="navy", lwd=3)

Just for grins, I added a third model that fits all of the data with a generalized additive model using the function gam() from the package mgcv.

library(mgcv)
model3 <- gam(dat.1951 ~ s(x))
lines(x, fitted(model3), col="orange", lwd=3, lty=2)

dividing data and fitting in R

1 Answers1