The R package mlr supports multi-label classification, which maps a feature vector to a set of discrete labels Y_1, Y_2, ..., Y_k. For example, Y_1, ..., Y_k might be categorical demographic traits such as age, income, and gender, several of which may apply to a given example in the training data. This is sometimes called multi-task learning, I believe.

Some regression tasks, such as canonical correlation analysis, have a similar flavor: the labels are continuous and vector-valued. What is the best way to represent such a task in mlr? I have managed to shoehorn canonical correlation analysis into a regular regression task, but I am badly abusing the predict and performance methods (I want to return a vector-valued prediction that is compared to a vector-valued underlying truth).
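To make the goal concrete, here is a minimal sketch in base R, outside mlr, of a vector-valued prediction compared against a vector-valued truth. It uses multivariate linear regression (`lm` with a matrix response) as a stand-in, not canonical correlation analysis and not an mlr task. The simulated data and the names `train`, `test`, and `rmse_per_target` are assumptions made purely for illustration.

```r
# Simulated data (hypothetical): two features and two continuous targets.
set.seed(1)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
d <- data.frame(
  x1 = x1,
  x2 = x2,
  y1 = 2 * x1 - x2 + rnorm(n, sd = 0.1),
  y2 = -x1 + 3 * x2 + rnorm(n, sd = 0.1)
)
train <- d[1:80, ]
test <- d[81:100, ]

# lm() accepts a matrix response and fits one linear model per target column.
fit <- lm(cbind(y1, y2) ~ x1 + x2, data = train)

# Predictions come back as a matrix: one row per example, one column per target.
pred <- predict(fit, newdata = test)
truth <- as.matrix(test[, c("y1", "y2")])

# One possible vector-valued performance measure: RMSE per target.
rmse_per_target <- sqrt(colMeans((pred - truth)^2))
print(rmse_per_target)
```

The point is only the shape of the objects: `pred` and `truth` are both matrices, and the performance measure either stays vector-valued or needs an explicit aggregation rule.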

Another approach would be to "vectorize" the training data, so that with a K-valued target each training example appears K times. However, this loses some of the nice structure in the problem, the predictions, and the performance evaluation.
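The "vectorized" layout can be sketched in base R as follows. The data frame `wide`, its column names, and the `example_id` and `target_id` columns are hypothetical and chosen only for illustration.

```r
# Hypothetical wide data: 3 examples, 2 features, K = 2 continuous targets.
wide <- data.frame(
  x1 = c(0.1, 0.5, 0.9),
  x2 = c(1.0, 2.0, 3.0),
  y1 = c(10, 20, 30),
  y2 = c(-1, -2, -3)
)
target_cols <- c("y1", "y2")
feature_cols <- setdiff(names(wide), target_cols)

# Stack the data so that each example appears once per target (K times in total).
# `target_id` records which target a row refers to; `y` holds that target's value.
long <- do.call(rbind, lapply(target_cols, function(tc) {
  data.frame(
    example_id = seq_len(nrow(wide)),
    wide[, feature_cols, drop = FALSE],
    target_id = factor(tc, levels = target_cols),
    y = wide[[tc]]
  )
}))
rownames(long) <- NULL
print(long)
```

The stacked table has a single numeric target `y`, so it fits an ordinary regression task with `target_id` as an extra feature. The cost is visible in the layout: the K rows that share an `example_id` are no longer independent, so they would need to be kept together during resampling, and per-example vector predictions have to be reassembled by hand.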

Andrew M

1 Answer

It sounds like this would require a special type of task and learner (or wrapped learner), just as multilabel classification does.

Lars Kotthoff
  • Thanks. Is there any documentation or resources you can point me towards defining custom tasks and learners? – Andrew M Jun 06 '17 at 15:45
  • There's a section on custom learners in the [tutorial](https://mlr-org.github.io/mlr-tutorial/devel/html/create_learner/index.html). Unfortunately we don't have any documentation on defining custom tasks at the moment. – Lars Kotthoff Jun 07 '17 at 22:32
  • Yes, the tutorial on custom learners is helpful, but as it says, "Defining a completely new type of learner that has special properties and does not fit into one of the existing schemes is of course possible, but much more advanced and not covered here." Do you have a pointer to the source code I should be browsing to get a feel for what's involved? The mlr framework for cross-validation, etc., is very good, so I'd love to leverage it. – Andrew M Jun 08 '17 at 03:13
  • Well, the source for any learner will do really. Compare the implementations of classification vs. regression vs. survival vs. cluster learners. – Lars Kotthoff Jun 10 '17 at 21:12