I am currently working on fitting a glm()
on a dataset with a proportion variable (C3_proportion) along a continuous gradient (elevation) and I am trying to obtain the x values for y=c(0.25,0.5,1). After some research, it seems like a binomial glm would be my best option. So here are my data, the model, and the ggplot2 plot I was able to come up with.
# Library load
library(ggplot2)
library(ggpmisc)
# Example data
df <- data.frame("mountain"="Sumapaz",
"C3_proportion"=c(0.2692308,0.2692308,0.2692308,0.2692308,0.3090909,0.3333333,0.3333333,0.3333333,0.3333333,0.3333333,0.3333333,0.3333333,0.3333333,0.3333333,0.3461538,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.3555556,0.3939394,0.3939394,0.3939394,0.3939394,0.3939394,0.3939394,0.3939394,0.3939394,0.3939394,0.4473684,0.4827586,0.4827586,0.4827586,0.4827586,0.4827586,0.4827586,0.4827586,0.4827586,0.4827586,0.6341463,0.7058824,0.7058824,0.7058824,0.7058824,0.7058824,0.7058824,0.7058824,0.7058824,0.7058824,0.7674419,0.8666667,0.8666667,0.8666667,0.8666667,0.8666667,0.8666667,0.8666667,0.8666667,0.8666667,0.875,0.8636364,0.8636364,0.8636364,0.8636364,0.8636364,0.8636364,0.8636364,0.8636364,0.8636364,0.8636364),
"elevation"=c(300,350,400,450,500,550,600,650,700,750,800,850,900,950,1000,1050,1100,1150,1200,1250,1300,1350,1400,1450,1500,1550,1600,1650,1700,1750,1800,1850,1900,1950,2000,2050,2100,2150,2200,2250,2300,2350,2400,2450,2500,2550,2600,2650,2700,2750,2800,2850,2900,2950,3000,3050,3100,3150,3200,3250,3300,3350,3400,3450,3500,3550,3600,3650,3700,3750,3800,3850,3900,3950,4000))
# Model
fit=glm(C3_proportion~poly(elevation, 2),data=df, family = "binomial")
# Plot
ggplot(df, aes(x=elevation, y=C3_proportion)) +
geom_point(aes(colour=mountain),alpha = 0.8) +
geom_smooth(method = "glm", method.args = list(family = "binomial"), formula= y ~ poly(x,2), se = F,lwd=0.5, colour="black")+
stat_smooth(method = "glm", method.args = list(family = "binomial"), formula= y ~ poly(x,2), colour = "black", lty="dotted", geom = "ribbon", fill = NA)+
xlab("Elevation (m)") + ylab("Proportion of C3 genera")
However, whenever I try to predict values, I cant run it for the y axis,
> predict(fit, newdata = data.frame(C3_proportion=c(0.25,0.5,1)))
Error in poly(elevation, 2, coefs = list(alpha = c(2150, 2150), norm2 = c(1, :
object 'elevation' not found
...and even when I tried to do so with the x values, the predicted elevations are not consistent with the ones shown by the fitted curve on the plot.
# Predict values
predict(fit, newdata = data.frame(elevation=c(1000,2000,3000)))
#1 2 3
#-0.88574136 0.07365174 1.03304484