
I am looking for a way to train a DNNClassifier (4 classes, 20 numeric features) from a data file of imbalanced, rewarded samples. Each class represents a game action, and the reward is that action's score. The features are the given observations. So it looks like a Q-learning problem... but Q-learning is an online training method that does not work from a stored dataset.

I tried to handle this with sample weights, using the following formula:

weight = ((reward-minreward)/(maxreward-minreward))*(totalsamples/classsamples)
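For reference, a minimal NumPy sketch of the formula above. The function name `sample_weights` and the toy `rewards` / `labels` arrays are made up for illustration; only the formula itself comes from the question.

```python
import numpy as np

def sample_weights(rewards, labels):
    """weight = ((reward-minreward)/(maxreward-minreward)) * (totalsamples/classsamples)"""
    rewards = np.asarray(rewards, dtype=float)
    labels = np.asarray(labels)

    # Min-max scale the rewards to [0, 1]; fall back to 1.0 if all rewards are equal.
    span = rewards.max() - rewards.min()
    scaled = (rewards - rewards.min()) / span if span > 0 else np.ones_like(rewards)

    # classsamples for each sample: the number of samples sharing its class label.
    _, inverse, counts = np.unique(labels, return_inverse=True, return_counts=True)
    class_samples = counts[inverse]

    return scaled * (len(labels) / class_samples)

# Toy data (hypothetical): three samples of class 0, one of class 1.
rewards = [0.0, 1.0, 2.0, 4.0]
labels = [0, 0, 0, 1]
weights = sample_weights(rewards, labels)
```

One property worth noting: with this formula the sample holding the minimum reward always gets a weight of 0, so it contributes nothing to a weighted loss.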

With 180k samples the accuracy is poor; with 490k samples it reaches 83%, which is still not good enough.

So what is the best way to do this:

  • with weights as I did, but with more samples or another formula
  • with a Q-learning algorithm (but I don't know how to do that...)
  • with a Learning to Rank algorithm (I did not find any good, complete tutorial)

Thanks for any answers.

GerardL
I also tried weight = math.exp(reward)*(totalsamples/classsamples) and weight = ((reward-minreward)/(maxreward-minreward))*(1-classsamples/totalsamples). Both were slightly less accurate, by about 1 or 2%. No significant difference. – GerardL Jan 24 '20 at 08:49
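The two alternative formulas from this comment could be sketched as follows. This assumes the exponential is the ordinary `exp` (written here as the vectorized `np.exp`); the function names and toy data are hypothetical.

```python
import numpy as np

def class_samples_per_row(labels):
    """For each sample, the number of samples that share its class label."""
    _, inverse, counts = np.unique(np.asarray(labels), return_inverse=True, return_counts=True)
    return counts[inverse]

def exp_weights(rewards, labels):
    """weight = exp(reward) * (totalsamples / classsamples)"""
    rewards = np.asarray(rewards, dtype=float)
    return np.exp(rewards) * (len(rewards) / class_samples_per_row(labels))

def complement_weights(rewards, labels):
    """weight = min-max scaled reward * (1 - classsamples / totalsamples)"""
    rewards = np.asarray(rewards, dtype=float)
    # Assumes the rewards are not all identical (otherwise the span is 0).
    scaled = (rewards - rewards.min()) / (rewards.max() - rewards.min())
    return scaled * (1.0 - class_samples_per_row(labels) / len(rewards))

# Toy data (hypothetical): three samples of class 0, one of class 1.
rewards = [0.0, 1.0, 2.0, 4.0]
labels = [0, 0, 0, 1]
w_exp = exp_weights(rewards, labels)
w_comp = complement_weights(rewards, labels)
```

Unlike the min-max variants, the exponential version never assigns a weight of 0, but it can grow very quickly if rewards are large.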
