Let's say I have two classes in training, class A (100 samples) and class B (100 samples), but at test time class A has 1000 samples and class B has 100 samples. How am I supposed to calculate and use the weights for a weighted CrossEntropy Loss? I am confused about whether they should be 0.5, 0.5 or not. How can I represent the true distribution?
1 Answer
The best setups usually don't use class weights at all.
You clearly have a skew between the class distributions (1:1 in training vs 10:1 at test time). If you can't do anything else, set the weight for class A to 10 and for class B to 1 so the loss reflects the final distribution. The weights don't have to add up to 1, and you can lower the learning rate if you use high values (each weight is just a multiplier on that class's loss component).
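For PyTorch's `CrossEntropyLoss` that looks roughly like the sketch below (mapping class A to index 0 and class B to index 1 is my assumption, and the logits/targets are dummy tensors):

```python
import torch
import torch.nn as nn

# Class indices: 0 = A, 1 = B. Weights do not need to sum to 1;
# each weight just scales that class's contribution to the loss.
class_weights = torch.tensor([10.0, 1.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(8, 2)            # dummy model outputs (batch of 8, 2 classes)
targets = torch.randint(0, 2, (8,))   # dummy integer class labels
loss = criterion(logits, targets)     # weighted loss, averaged over the batch
```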
This is a rough fix; the proper one is to fix the sampling so that the training set has the same imbalance as well (and that imbalance should match what you see in practice).
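One way to set up that resampling in PyTorch is a `WeightedRandomSampler`; this is only a sketch, with a dummy dataset standing in for the real 100/100 training data and labels 0 for A and 1 for B assumed:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Dummy 1:1 training set standing in for the real one: label 0 = A, label 1 = B.
features = torch.randn(200, 16)
labels = torch.cat([torch.zeros(100, dtype=torch.long), torch.ones(100, dtype=torch.long)])
train_dataset = TensorDataset(features, labels)

# Draw class A ten times as often as class B, so the sampled training stream
# approximates the 10:1 distribution expected at test time.
per_class_rate = torch.tensor([10.0, 1.0])
sample_weights = per_class_rate[labels]

sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)
loader = DataLoader(train_dataset, batch_size=32, sampler=sampler)
```

If the data is resampled like this, the loss weights would typically go back to the default, since the imbalance is then already represented in the training stream.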
The above doesn't hold if making a mistake on B is significantly more "costly" in practice than making a mistake on A (e.g. missing a stock jump is just a missed opportunity, but buying stock and not getting the jump costs you money). In that case you do want a higher penalty on B.

- So two questions: `(A)` If I were to resample the training set to a 10:1 distribution to represent the final distribution, then what would the weights be? `(B)` What is the difference between the weights for the loss and a penalty? I have seen certain classifiers that use higher "weights" and call them a "penalty". I understand what you are trying to say, but how do I take weights and the penalty into account? – amy Jan 25 '19 at 15:37
- How do I add a "cost" when I am using cross-entropy loss? – amy Jan 27 '19 at 02:24