
I have installed the caret package in RStudio. I am using it to split my data and then fit a model on the training portion. The code for the split is below:

library(caret)
set.seed(1234)
trainIndex <- createDataPartition(y, times = 1, p = 0.5, list = FALSE)
Training <- dataset[trainIndex, ]
Validation <- dataset[-trainIndex, ]

This splits the data 50/50 into training and validation sets. But when I fit the model with glm(), it uses 100% of the data for training.

glm(y~ dataset$x1 + dataset$x2 + dataset$x3, family = binomial(link = "logit"), data = Training)

I'm not sure what's going wrong.

rawr

1 Answer

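You should not use $ in the model formula when you supply the data = ... argument. Your formula refers to dataset$x1, dataset$x2 and dataset$x3 explicitly, so those variables are taken from the full dataset and the data = Training argument is bypassed. That is why 100% of the data ends up in the fit. Use bare column names so that they are looked up in Training: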

glm(y~ x1 + x2 + x3, family = binomial(link = "logit"), data = Training)
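To check the difference yourself, here is a minimal sketch on made-up data. The toy dataset, the base-R sample() split (used in place of createDataPartition so the example has no package dependency), and the names fit_all and fit_train are all assumptions for illustration and not taken from the question. It also assumes y is a column of dataset; if y only exists as a separate vector in your workspace, add it to the data frame before splitting so that it is subset along with the predictors.

```r
# Hypothetical toy data; column names mirror the question.
set.seed(1234)
n <- 200
dataset <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dataset$y <- rbinom(n, 1, plogis(0.5 * dataset$x1 - 0.3 * dataset$x2))

# Base-R 50/50 split (stands in for createDataPartition in this sketch).
trainIndex <- sample(seq_len(n), size = n / 2)
Training <- dataset[trainIndex, ]
Validation <- dataset[-trainIndex, ]

# Formula written with dataset$: every variable is read from the full
# data frame, so the data = Training argument has no effect.
fit_all <- glm(dataset$y ~ dataset$x1 + dataset$x2 + dataset$x3,
               family = binomial(link = "logit"), data = Training)

# Formula written with bare column names: variables are looked up in Training.
fit_train <- glm(y ~ x1 + x2 + x3,
                 family = binomial(link = "logit"), data = Training)

# Number of observations actually used in each fit.
n_all <- nobs(fit_all)
n_train <- nobs(fit_train)
print(c(with_dollar = n_all, bare_names = n_train, training_rows = nrow(Training)))
```

Comparing nobs() on a fitted model against nrow(Training) is a quick way to confirm which rows went into the fit.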
Park