
I have a set of features: x1, x2, x3. Furthermore, I have a set of labels: y1, y2, y3.

For example, my x variables are height, weight and years of education. Each yi is a grade in one of the following fields: Science, Arts and Management. Each student is assigned a grade for the corresponding field (Science, Arts, Management). I'd like to use the xgboost algorithm to identify the class with the minimum score. For example, if the marks are (10, 25, 5), then the algorithm should predict the class y3. How can I customize my objective function to achieve this? I am an R user.

  • "Each student was assigned a marks for Science, Arts, Management." What do you mean? Do you mean each student gave a score of how much he likes that field? – Eran Moshe Mar 01 '18 at 06:39
  • A mark assigned for each subject, based on the performance in the exam. Sorry for the confusion – user9382972 Mar 01 '18 at 06:49
  • And what do you want to predict? The score for each field? Do you have scores in field y1 while trying to predict the score in field y2? – Eran Moshe Mar 01 '18 at 07:20
  • I want to predict the class with minimum value, by passing y1, y2 and y3. – user9382972 Mar 01 '18 at 07:27
  • If you have y1, y2, y3 you don't need to predict anything. You just do min(y1, y2, y3). Try to re-phrase your question and define what you want to PREDICT – Eran Moshe Mar 01 '18 at 07:28
  • As @EranMoshe said, this is not a prediction problem but an ifelse type of problem. Please change the problem description to make clear why you want to predict – abhiieor Mar 01 '18 at 09:33

1 Answer


In that case, I'm not sure it's the best way to solve this, but it will solve it.

Build 3 models. Each model will predict one grade Yi based on x1, x2, x3. (This means you will copy your data 3 times and, for each copy, predict the corresponding Yi.)

So model1 will predict the grade of class1, model2 that of class2, and so on.

Then take the minimum over the three models' predictions. The class with the minimum predicted grade is the winner.

Use the regular "reg:linear" regression objective for each model (it is named "reg:squarederror" in newer xgboost releases).

Evaluate your program with a simple accuracy test.
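
The last two steps (taking the minimum over the three predictions, then scoring accuracy) can be sketched roughly as follows. This is a minimal Python illustration and not the answer author's code. The predicted grades are made-up placeholder values standing in for the output of the three trained models, and `argmin_label` and `accuracy` are hypothetical helper names. In R, the same selection can be done with `which.min` on each row of predictions.

```python
from typing import List, Sequence

# Class names in the same order as the three per-field models
# (Science, Arts, Management in the question's example).
LABELS = ("y1", "y2", "y3")


def argmin_label(scores: Sequence[float]) -> str:
    """Return the label whose score is smallest (first one wins on ties)."""
    best = min(range(len(scores)), key=lambda i: scores[i])
    return LABELS[best]


def accuracy(predicted: List[str], actual: List[str]) -> float:
    """Fraction of rows where the predicted class equals the true class."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# ASSUMPTION: placeholder outputs of model1, model2, model3 for four students.
# In practice each column would come from one trained regression model.
predicted_grades = [
    (12.0, 24.0, 6.5),
    (30.0, 18.0, 22.0),
    (9.0, 11.0, 15.0),
    (20.0, 8.0, 7.5),
]

# ASSUMPTION: placeholder true grades for the same four students.
true_grades = [
    (10, 25, 5),
    (28, 20, 19),
    (8, 12, 14),
    (21, 7, 9),
]

predicted_classes = [argmin_label(row) for row in predicted_grades]
true_classes = [argmin_label(row) for row in true_grades]
acc = accuracy(predicted_classes, true_classes)
print(predicted_classes, true_classes, acc)
```

Note that the accuracy here is computed on the derived class (which field has the lowest grade), not on the raw grades, so a model can have small regression error and still pick the wrong class when two grades are close.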

Eran Moshe
  • Thought that would be the right solution too (if the use of `xgboost` is mandatory). – kluu Mar 01 '18 at 09:10
  • Yes, this is one approach. But the problem is that we lose the information about the correlation between Y's. Yes, xgboost is mandatory – user9382972 Mar 01 '18 at 09:30
  • We don't lose correlation between Y's because, as you said, we will not have any of them in unseen data. If, however, you might have them, just add them as `Xs` (for example, if you try to predict `y3`, add `x4=y1, x5=y2`) – Eran Moshe Mar 01 '18 at 09:43
  • I was talking about the model training process. – user9382972 Mar 02 '18 at 02:48