I have a matrix of features (in columns) where the last column is a class label. Observations are in rows.
I use rpart in R to build a decision tree over a subset of my data and test it with predict using the rest of the data. The code to learn the tree is:
fTree <- rpart(feature$a ~ feature$m, data = feature[fold != k, ],
               method = "class", parms = list(split = "gini"))
The code to test it is:
predFeature <- predict(fTree, newdata = feature[fold == k, ],
                       type = "class")
where k is an integer that I use to select a subset of the data, while fold is a matrix I use to create the different subsets.
When I run predict, I get a warning message that some of you will probably recognize:
'newdata' had 306 rows but variables found have 3063 rows.
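In case it helps, here is a minimal sketch that I believe reproduces the same kind of warning on the built-in iris data set. It is not my real data: the fold vector below is a hypothetical stand-in (my real fold is a matrix), and Species and Petal.Length simply play the roles of my columns a and m.

```r
library(rpart)

set.seed(1)

# Hypothetical stand-in for my fold structure: assign each row to one of 5 folds
fold <- sample(rep(1:5, length.out = nrow(iris)))
k <- 1

# Same pattern as my code: the formula refers to columns as iris$...,
# while data = is given only the training rows
fTree <- rpart(iris$Species ~ iris$Petal.Length, data = iris[fold != k, ],
               method = "class", parms = list(split = "gini"))

# Same pattern as my test code: newdata is only the held-out rows
predFeature <- predict(fTree, newdata = iris[fold == k, ],
                       type = "class")

# Compare the number of predictions with the number of held-out rows
length(predFeature)
sum(fold == k)
```

With this sketch I would expect the number of predictions to match the full data set rather than the held-out rows, which is the same mismatch the warning reports for my data.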
I read a related post, but I could not understand the reason for the warning, so any further help is appreciated. Thanks in advance.