
I'm training a pretty basic NN on the Fashion-MNIST dataset. I'm using my own code, which is not important here. I use a rather simplified algorithm similar to Adam and a quadratic error formula, (train_value - real_value)**2, for training and error calculation. I apply basic backpropagation to each weight, and I only process 1/5 of the network weights for each training image. I use a single 128-unit hidden layer, as in the basic TensorFlow beginners' example, plus the input and output layers (the output with softmax and the input normalized to 0-1).
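For reference, here is a minimal Keras sketch of the beginner-style setup I'm comparing against (this is not my own code; it uses the tutorial's default optimizer and loss rather than the simplified Adam-like algorithm and squared error described above, and the epoch count is just an example):

```python
import tensorflow as tf

# Fashion-MNIST, inputs normalized to 0-1 as described above
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Single 128-unit hidden layer plus a softmax output, as in the TF beginners' example
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
```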

I'm not an expert at all, and I've only been able to train my network up to 77% accuracy on the test set.

As shown in the images below, I noticed that the gradients of the weights for most of my neurons converge to zero after a few epochs, but there are a few notable exceptions that just remain stubborn (the vertical lines in the first image separate the weights by neuron).

Could you recommend some general techniques to train these rogue neurons without affecting the others?

[Images: per-weight gradient plots over training; vertical lines in the first image separate the weights by neuron]

José Chamorro

1 Answer


You could add a constraint to the kernel (the weight matrix of the Dense layer). With one of these constraints, the weights can be kept within a given, user-defined range. See: TensorFlow.Keras Constraints
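For example, a minimal sketch (the layer size and the min/max values are placeholder values to tune, not recommendations):

```python
import tensorflow as tf

# Dense layer whose kernel column norms are constrained to a user-defined range
constrained_dense = tf.keras.layers.Dense(
    128,
    activation='relu',
    kernel_constraint=tf.keras.constraints.MinMaxNorm(min_value=0.0, max_value=2.0),
)
```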

In addition, you can try regularizers to prevent overfitting of the model, which may be indicated by some very large (absolute) weight values. For that, see for example the L1 or L2 regularizers: TensorFlow.Keras Regularizers
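A minimal sketch of adding an L2 penalty to the same kind of layer (the factor 1e-4 is just an example value to tune):

```python
import tensorflow as tf

# Dense layer with an L2 penalty on the kernel to discourage very large weights
regularized_dense = tf.keras.layers.Dense(
    128,
    activation='relu',
    kernel_regularizer=tf.keras.regularizers.l2(1e-4),
)
```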

Franneck