If I have a data frame with, let's say, factors in columns 1 and 2 and the label in column 3, is there any difference between:

train_pool <- catboost.load_pool(data = training[,1:2], label = training[,3])

And

train_pool <- catboost.load_pool(data = training[,1:2], 
                                 label = training[,3], cat_features=c(1,2))

That is, does it autodetect that columns 1 and 2 are factors and transform them to numbers even when I do not explicitly declare it in catboost.load_pool?

b) Is there any way in the R package to get the matrix with the categorical values transformed to numbers?

jay.sf

1 Answer

This may be off-topic, but in my classification task the labels are categories, and I used the following code to transform them into integers:

train_pool <- catboost.load_pool(data = train[-1],
                                 label = as.numeric(as.factor(train[,1]))-1)

Before the transformation, the labels looked like this: [image not available]

You can try transforming the features manually in the same way.
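As a rough illustration of a manual transformation, here is a minimal base R sketch that needs no CatBoost call. The data frame `training`, its column names, and the vector `labels_raw` are made up for this example. It uses `data.matrix()`, which replaces each factor with its internal integer codes, and shifts the codes to start at zero as in the label code above. This is plain integer encoding and is not necessarily what CatBoost computes internally for categorical features.

```r
# Hypothetical data: two factor columns and a label column.
training <- data.frame(
  color = factor(c("red", "green", "red", "blue")),
  size  = factor(c("S", "M", "L", "M")),
  label = c(1, 0, 1, 0)
)

# data.matrix() converts each factor to its integer level codes (1-based);
# subtracting 1L makes the codes zero-based.
encoded <- data.matrix(training[, 1:2]) - 1L

# The same idea for category labels stored as strings (hypothetical values).
labels_raw <- c("yes", "no", "yes", "no")
label <- as.integer(factor(labels_raw)) - 1L

print(encoded)
print(label)
```

Factor levels are sorted alphabetically by default, so the codes follow that order (for example, "blue" becomes 0, "green" 1, and "red" 2). Keep the level mapping, e.g. `levels(training$color)`, if you need to decode the numbers later.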

tuomastik