
I am working on a semantic segmentation project with multiclass data that is highly imbalanced. I searched for ways to compensate for this during training through the model.fit parameters, namely class_weight or sample_weight.

I can implement this with a class_weight dictionary such as

{0: 1, 1: 10, 2: 15}
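and pass it to model.fit along these lines (a minimal sketch, assuming an already compiled Keras segmentation model named model and hypothetical training arrays x_train / y_train with integer label masks; note that Keras rejects class_weight for 3+ dimensional targets, in which case sample_weight or a weighted loss is the usual workaround):

    # Hypothetical setup: `model` is a compiled Keras segmentation model,
    # x_train / y_train are the training images and integer label masks.
    class_weight = {0: 1, 1: 10, 2: 15}

    model.fit(
        x_train, y_train,
        epochs=10,
        batch_size=8,
        class_weight=class_weight,  # scales each instance's loss by the weight of its true class
    )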

I also saw a method of applying weights inside the loss function.

But at what point are these weights applied?

  1. If class_weights are used, where is the penalty applied? I already have a kernel_regularizer on each layer, so if my classes are penalized according to the class weights, will the penalty be applied to the output of every layer (y = Wx + b) or only at the final layer?
  2. Similarly, if I use a weighted loss function (a sketch of what I mean is shown below), is the penalty applied only at the final layer before the loss calculation, or at each layer before the final loss is computed?
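By a weighted loss function I mean something along these lines (a sketch only, assuming one-hot masks of shape (batch, H, W, num_classes) and softmax outputs from the final layer):

    import tensorflow as tf

    # Per-class weights indexed by class id, same values as the dictionary above.
    CLASS_WEIGHTS = tf.constant([1.0, 10.0, 15.0])

    def weighted_categorical_crossentropy(y_true, y_pred):
        # y_true: one-hot masks, shape (batch, H, W, num_classes)
        # y_pred: softmax output of the final layer, same shape
        per_pixel_ce = tf.keras.losses.categorical_crossentropy(y_true, y_pred)  # (batch, H, W)
        # Weight of each pixel's true class, looked up from CLASS_WEIGHTS.
        pixel_weights = tf.reduce_sum(y_true * CLASS_WEIGHTS, axis=-1)           # (batch, H, W)
        return tf.reduce_mean(per_pixel_ce * pixel_weights)

    # model.compile(optimizer="adam", loss=weighted_categorical_crossentropy)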

Any explanation of this would be very useful.

shankar ram

1 Answer


The class_weights in your dictionary are there to account for your imbalanced data. They never change; they only increase the penalty for misclassified instances of the minority classes, so that your network pays more attention to them and the resulting gradients treat one 'Class2' instance as if it were 15 times more important than one 'Class0' instance.
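As a toy illustration of that scaling (not the exact Keras internals, but the effect is the same): if three instances have identical unweighted cross-entropy, their weighted contributions differ by exactly the class weights.

    import numpy as np

    class_weight = {0: 1.0, 1: 10.0, 2: 15.0}

    true_classes = np.array([0, 2, 1])           # one instance of each class
    per_instance_ce = np.array([0.3, 0.3, 0.3])  # identical unweighted losses

    weights = np.array([class_weight[c] for c in true_classes])  # [1., 15., 10.]
    weighted_loss = np.mean(per_instance_ce * weights)           # (0.3 + 4.5 + 3.0) / 3 = 2.6
    print(weighted_loss)  # the Class2 instance contributes 15x as much as the Class0 instance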

The kernel_regularizer you mention is part of the loss function and penalizes large norms of the weight matrices throughout the network (if you use kernel_regularizer = tf.keras.regularizers.l1(0.01) in a Dense layer, it only affects that layer). So that is a different kind of weight that has nothing to do with classes, only with the weights inside your network. Your eventual loss will be something like loss = cross_entropy + a * norm(weight_matrix), so the network has the additional task of minimizing the classification loss (cross-entropy) while keeping the weight norms low.
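A minimal sketch of that point, using a toy two-layer classifier: the L1 penalty is attached to the layer it was declared on and is simply added to the classification loss, independently of any class weights.

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(
            64, activation="relu", input_shape=(100,),
            kernel_regularizer=tf.keras.regularizers.l1(0.01)),  # penalizes this layer's weight norm only
        tf.keras.layers.Dense(3, activation="softmax"),           # no regularizer on this layer
    ])

    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Total loss minimized during training:
    #   loss = sparse_categorical_crossentropy + 0.01 * ||W of the first Dense layer||_1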

Gaussian Prior
  • Thank you. Just a small clarification: so both work only on the output of the network (the classification weights) and do not change the weights of each layer? Is this correct? – shankar ram Dec 24 '20 at 13:30
  • kernel_regularizer affects the weights of the layer you set it on, so it may affect the weights of the first layer, for example (or you can use a kernel regularizer on every layer). Class weights have nothing to do with your network weights directly; they only influence the network weights indirectly, as training goes on, so that the learned weights account for the fact that some classes are more important than others. So, once again, class weights are not weights of the network; they influence all the weights of the network in an indirect way (through training). – Gaussian Prior Dec 24 '20 at 13:59
  • Thank you, this is exactly where I was confused. – shankar ram Dec 24 '20 at 15:09