I am new to R and I am trying to fit a logistic regression to a set of clinical data. My independent variables are AGE, TEMP, WBC, NLR, CRP, PCT, ESR, IL6, and TIME. My dependent variable is the binary outcome CRKP.
After fitting the model with glm(), I got this warning message:
glm.fit <- glm(CRKP ~ AGE + TEMP + WBC + NLR + CRP + PCT + ESR, data = cv, family = binomial, subset=train)
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
I looked up possible causes and used the corrplot function to check whether multicollinearity among the predictors could be leading to overfitting.
This is what I have as the plot.
The correlation plot shows that ESR and PCT are highly correlated, and so are CRP and IL6. But they are all important clinical indicators that I would like to include in the model.
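In case it helps to reproduce the check, here is a minimal sketch of what I did. I cannot post the clinical data, so the data frame below (cv_sim, a name I made up for this post) is simulated stand-in data with made-up values; only the column names match mine, and ESR/PCT and CRP/IL6 are correlated by construction. It uses base R's cor(), and the 0.8 cutoff is an arbitrary choice of mine.

```r
set.seed(1)
n <- 200

# Simulated stand-in data: made-up values, NOT my clinical data.
# ESR is built from PCT, and IL6 from CRP, so those pairs correlate by construction.
PCT <- rnorm(n)
CRP <- rnorm(n)
cv_sim <- data.frame(
  AGE = rnorm(n, mean = 60, sd = 10),
  WBC = rnorm(n, mean = 9, sd = 3),
  PCT = PCT,
  ESR = PCT + rnorm(n, sd = 0.3),
  CRP = CRP,
  IL6 = CRP + rnorm(n, sd = 0.3)
)

# Pairwise Pearson correlations between the predictors
cor_mat <- cor(cv_sim)
print(round(cor_mat, 2))

# List each pair once (upper triangle) whose |r| exceeds an arbitrary cutoff of 0.8
high <- which(abs(cor_mat) > 0.8 & upper.tri(cor_mat), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]],
  r    = cor_mat[high]
)
print(high_pairs)

# If the corrplot package is installed, corrplot::corrplot(cor_mat) draws this matrix.
```

On my real data the same steps flag the ESR/PCT and CRP/IL6 pairs.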
I tried using the VIF to decide which variables to drop, but wouldn't that bias the model? I would also have to give up some of my variables of interest.
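For context, this is roughly what the VIF step looks like, as a sketch on simulated stand-in data (made-up values, not my clinical data; X is a name I chose for this post). Instead of a package function I compute it by hand here: for numeric predictors, the VIF of a variable is 1 / (1 - R^2) from regressing it on the other predictors, which is the same as the diagonal of the inverse of the predictor correlation matrix.

```r
set.seed(1)
n <- 200

# Simulated stand-in predictors: made-up values, NOT my clinical data.
PCT <- rnorm(n)
CRP <- rnorm(n)
X <- data.frame(
  AGE = rnorm(n, mean = 60, sd = 10),
  PCT = PCT,
  ESR = PCT + rnorm(n, sd = 0.3),  # correlated with PCT by construction
  CRP = CRP,
  IL6 = CRP + rnorm(n, sd = 0.3)   # correlated with CRP by construction
)

# VIF from the inverse of the predictor correlation matrix
vif_inv <- diag(solve(cor(X)))

# Same quantity from the definition: regress each predictor on all the others
vif_reg <- sapply(names(X), function(v) {
  f  <- reformulate(setdiff(names(X), v), response = v)
  r2 <- summary(lm(f, data = X))$r.squared
  1 / (1 - r2)
})

print(round(cbind(vif_inv, vif_reg), 2))
```

In this toy data the correlated pairs get large VIFs while AGE stays near 1, which is the same pattern that would push me to drop one variable from each pair.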
Does anyone know what I can do? Please help. Thank you!