
I have data on discriminating between two species of Microtus, using both classified and unclassified observations.

I built a logistic model from the 89 classified specimens and used it to predict the group membership of the remaining 199 specimens.

A sample of my data:

    Group           M1      M2      Fora    Phone   len     height  Rost
1   multiplex       2078    1649    1708    3868    5463    2355    805
2   subterraneus    1749    1482    1462    3797    4855    2218    765
3   unknown         1841    1562    1585    3750    5024    2232    821

I split the data into 89 observations to train my model and kept the 199 unknown observations to be predicted:

train.data = microtus[c(1:89),c(1:9)]
test.data = microtus[c(90:288),c(1:9)]
train.data$Group =ifelse(train.data$Group=="multiplex", 1, 0)

My Model

model <- glm(Group ~ M1Left + M3Left + Foramen + Length + Height, 
    family = binomial(), data = train.data)
summary(model)

Predictions

pred <- predict(model, test.data, type = "response")

I built a confusion matrix

createConfusionMatrix=function(actual, preds){
  predClass=ifelse(preds<0.5, 0, 1)

  table(actual,predClass)
}
## Confusion matrix 
createConfusionMatrix(test.data$Group,pred)

My output:

              predClass
actual           0   1
  multiplex      0   0
  subterraneus   0   0
  unknown       70 129

This output does not seem right to me.

Can I get help on how to build a confusion matrix?

Mo.Muse

1 Answer

Your code is working (using the data you shared):

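#Code
#df is the three-row data frame defined under "Some data used" below;
#create it before running this block
#Resample its rows to get 288 observations (no seed is set, so counts vary between runs)
i <- sample(1:3, 288, replace = TRUE)
#Data
microtus <- df[i,]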
#Split
train.data = microtus[c(1:89),]
test.data = microtus[c(90:288),]
train.data$Group =ifelse(train.data$Group=="multiplex", 1, 0)
#Model
model <- glm(Group ~ M1 + M2 + Fora + len + height, 
             family = binomial(), data = train.data)
#Predict
test.data$pred <- predict(model, test.data, type = "response")
#Check
createConfusionMatrix=function(actual, preds){
  predClass=ifelse(preds<0.5, 0, 1)
  
  table(actual,predClass)
}
## Confusion matrix 
createConfusionMatrix(test.data$Group,test.data$pred)

Output:

              predClass
actual          0  1
  multiplex     0 75
  subterraneus 57  0
  unknown      67  0

Some data used:

#Data
df <- structure(list(Group = c("multiplex", "subterraneus", "unknown"
), M1 = c(2078L, 1749L, 1841L), M2 = c(1649L, 1482L, 1562L), 
    Fora = c(1708L, 1462L, 1585L), Phone = c(3868L, 3797L, 3750L
    ), len = c(5463L, 4855L, 5024L), height = c(2355L, 2218L, 
    2232L), Rost = c(805L, 765L, 821L)), class = "data.frame", row.names = c("1", 
"2", "3"))

Maybe start a fresh R session and try again.
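Note that a confusion matrix can only compare predictions with known labels, so rows whose Group is unknown add nothing to it. As a minimal sketch (not part of the code above), the helper below keeps only the labelled rows and fixes the factor levels so that both classes always appear, even when one of them is never predicted. The name createLabelledConfusionMatrix, its arguments, and the toy vectors are hypothetical and only there for illustration.

```r
# Hypothetical toy input: true groups and predicted probabilities of "multiplex"
actual <- c("multiplex", "subterraneus", "multiplex",
            "subterraneus", "multiplex", "unknown")
probs  <- c(0.91, 0.12, 0.35, 0.08, 0.77, 0.60)

# Hypothetical helper: confusion matrix restricted to rows with a known label
createLabelledConfusionMatrix <- function(actual, preds,
                                          positive = "multiplex",
                                          negative = "subterraneus",
                                          cutoff = 0.5) {
  # Drop rows such as "unknown" that have no true class to compare against
  keep <- actual %in% c(positive, negative)
  lev  <- c(negative, positive)
  # Fixed levels keep the table 2 x 2 even if a class is missing
  actualClass <- factor(actual[keep], levels = lev)
  predClass   <- factor(ifelse(preds[keep] < cutoff, negative, positive),
                        levels = lev)
  table(actual = actualClass, predClass = predClass)
}

cm <- createLabelledConfusionMatrix(actual, probs)
print(cm)
```

With your own data this would be called on rows that still carry their species label, for example a part of the 89 classified specimens held back from training, rather than on the 199 unknown ones.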

Duck