I have 10 datasets with binary and multiclass factor responses. I fitted a logistic regression with R's glm, but predict(model, newdata, type="response") returns the class probability. How can I get the predicted class instead, as other models give? For example:

df=data.frame(y=c(1,0,0,1),x1=c(1,2,3,4),x2=c(12,13,43,3))
df$y=as.factor(df$y)
testdf=data.frame(y=c(1,1,0,0),x1=c(11,16,65,8),x2=c(3,2,5,0))
testdf$y=as.factor(testdf$y)
model_glm=glm(y~.,data=df,family="binomial")
pred_glm=predict(model_glm,newdata=testdf,type="response")

This gives the predicted probabilities:

> pred_glm
           1            2            3            4 
2.220446e-16 2.220446e-16 2.220446e-16 2.220446e-16
  • However, I need the class prediction (0 or 1), or the predicted probabilities in two columns: one for class 1 and one for class 0.
  • And how can this be done when I have multiple classes?
smci
Kamel

1 Answer

I had some issues doing this with my data a while back. To predict, you must first fit your logistic regression model:

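# family = "binomial" makes this a logistic regression (glm defaults to gaussian)
lgr = glm(y ~ x1 + x2 +x3...x10, data = df1, family = "binomial")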

Then you can predict or play around with your model from there.

New predictions (NOTE: df2 must contain the same predictor columns as df1; it does not need the same number of rows):

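# pass the data frame itself, not its name as a string;
# type = "response" returns probabilities rather than log-odds
predicts = predict(lgr, newdata = df2, type = "response")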

Then, to view it and use it for presentations, etc., I would write it to a csv:

write.csv(predicts, "K:/filelocation/filename.csv")

EDIT: To classify based on the predicted probabilities, you need to choose a probability threshold, for example by calculating sensitivity and specificity. See: https://stats.stackexchange.com/questions/25389/obtaining-predicted-values-y-1-or-0-from-a-logistic-regression-model-fit/25398#25398
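As a minimal sketch of turning glm probabilities into class labels, the snippet below uses simulated data (not the data from the question) and an assumed cutoff of 0.5. The names train, test, cutoff, probs and pred_class are illustrative only. With a factor response, glm treats the first level as "failure", so type="response" returns the probability of the second level.

```r
# Hypothetical toy data, simulated for illustration only
set.seed(1)
train <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
train$y <- factor(rbinom(100, 1, plogis(0.5 * train$x1 - 0.8 * train$x2)))
test <- data.frame(x1 = rnorm(20), x2 = rnorm(20))

fit <- glm(y ~ x1 + x2, data = train, family = binomial)

# Probability of the second factor level (here "1")
p1 <- predict(fit, newdata = test, type = "response")

# Two-column probability matrix, one column per class
probs <- cbind(1 - p1, p1)
colnames(probs) <- levels(train$y)

# Class prediction with an assumed cutoff of 0.5
cutoff <- 0.5
pred_class <- factor(ifelse(p1 > cutoff, levels(train$y)[2], levels(train$y)[1]),
                     levels = levels(train$y))
```

The 0.5 cutoff is only a default; a different value may suit unbalanced classes or unequal error costs. For more than two classes a binomial glm does not apply directly; one option is nnet::multinom, whose predict method accepts type="class" for labels and type="probs" for a probability matrix.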

ephackett
  • Many thanks for your interest. No, that is not what I meant; I have updated the question, please look again. – Kamel Dec 03 '15 at 15:35
  • That is actually a bit more complicated, then. You need to choose your own threshold for classification. An lda or svm will do this automatically, but with glm you have to use the sensitivity and specificity to set it yourself: http://stats.stackexchange.com/questions/25389/obtaining-predicted-values-y-1-or-0-from-a-logistic-regression-model-fit – ephackett Dec 03 '15 at 15:52
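
One way to pick such a threshold, sketched here in base R on simulated data, is to compute sensitivity and specificity over a grid of cutoffs and take the one that maximises their sum (Youden's J). This is only one possible criterion, and the helper sens_spec and the variable names are hypothetical, not part of any package.

```r
# Hypothetical helper: sensitivity and specificity at a single cutoff.
# truth: 0/1 labels; prob: predicted P(y = 1).
sens_spec <- function(truth, prob, cutoff) {
  pred <- as.integer(prob > cutoff)
  c(cutoff = cutoff,
    sensitivity = sum(pred == 1 & truth == 1) / sum(truth == 1),
    specificity = sum(pred == 0 & truth == 0) / sum(truth == 0))
}

# Simulated data for illustration only
set.seed(42)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-1 + 1.5 * x))
fit <- glm(y ~ x, family = binomial)
p <- fitted(fit)  # fitted probabilities on the training data

# Evaluate a grid of candidate cutoffs
cutoffs <- seq(0.05, 0.95, by = 0.05)
tab <- as.data.frame(t(sapply(cutoffs, function(k) sens_spec(y, p, k))))

# Youden's J = sensitivity + specificity - 1; pick the cutoff that maximises it
tab$youden <- tab$sensitivity + tab$specificity - 1
best <- tab$cutoff[which.max(tab$youden)]
pred_class <- as.integer(p > best)
```

Choosing the cutoff on the same data used to fit the model is optimistic; in practice the grid would normally be evaluated on held-out data or by cross-validation.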