I'm trying to fit a logistic regression using glm(family='binomial').
Here is the model:
model <- glm(f_ocur ~ altitud + UTM_X + UTM_Y + j_sin + j_cos + temp_res + pp,
             offset = log(1/off), data = mydata, family = 'binomial')
mydata has 76820 observations, and the response variable (f_ocur) is binary (0/1).
These data are a sample of a bigger dataset, so the idea of setting the offset is to account for the fact that the data used here represent only a sample of the real data to be analysed.
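To make the intent concrete: as I understand it, an offset enters the linear predictor with its coefficient fixed at 1, so the model I am aiming for would look roughly like the following. This is only a sketch. It assumes off is a per-row column in mydata, and the beta symbols and the index i are my own notation, not anything produced by glm.

```latex
% p_i is the probability that f_ocur = 1 for observation i.
% The last term is the offset: it has no coefficient to estimate.
\[
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i}
  = \beta_0
  + \beta_1\,\mathrm{altitud}_i
  + \beta_2\,\mathrm{UTM\_X}_i
  + \beta_3\,\mathrm{UTM\_Y}_i
  + \beta_4\,\mathrm{j\_sin}_i
  + \beta_5\,\mathrm{j\_cos}_i
  + \beta_6\,\mathrm{temp\_res}_i
  + \beta_7\,\mathrm{pp}_i
  + \log\!\left(\frac{1}{\mathrm{off}_i}\right)
\]
```

If that reading is right, the offset term shifts the linear predictor for every row, which is why I expected the fit to change when the offset is dropped.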
For some reason the offset does not seem to have any effect. When I fit this model with the offset and then fit the same model without it, I get exactly the same result. I expected the results to differ, but there is no difference.
Am I doing something wrong? Should the offset instead go in the formula, together with the linear predictors, like this:
model <- glm(f_ocur~altitud+UTM_X+UTM_Y+j_sin+j_cos+temp_res+pp+offset(log(1/off)),
data=mydata, family='binomial')
Once the model is ready, I'd like to use it with new data. The new data would be used to validate this model and have the same columns. My idea is to use:
validate <- predict(model, newdata=data2, type='response')
And here comes my second question: does the predict function take into account the offset used to fit the model? If not, what should I do to get the correct probabilities for the new data?