
While trying to implement neural network training algorithms, I came across several concepts, including gradient descent, which tries to mimic a ball rolling down a hill, and velocity and momentum, which model that rolling ball more closely.

I initialized my weights, weight_deltas, and weight_velocities thus:

sizes = [2, 3, 1]
momentum_coefficient = 0.5    
weights = [ 2 * np.random.random((a, b)) - 1 for a, b in zip(sizes[:-1], sizes[1:]) ]
weight_velocities = [ np.ones(w.shape) for w in weights ]
weight_deltas = [ np.zeros(w.shape) for w in weights ]

After calculating the deltas (the derivatives of the cost function with respect to the weights), I updated the weights thus:

for l in xrange(len(sizes) - 1):
    weight_velocities[l] = (momentum_coefficient * weight_velocities[l]) - weight_deltas[l]
    weights[l] += weight_velocities[l]
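To make sure I understand the update rule itself, here is a stripped-down toy version of it (my own example, not my actual network): plain momentum descent on the one-dimensional function f(w) = w², whose gradient is 2w, with the velocity started at zero ("at rest"):

```python
import numpy as np

momentum_coefficient = 0.5
learning_rate = 0.1

w = np.array([5.0])
velocity = np.zeros_like(w)  # ball released at rest

for _ in range(100):
    gradient = 2 * w  # derivative of f(w) = w**2
    # same update shape as in my training loop above
    velocity = momentum_coefficient * velocity - learning_rate * gradient
    w += velocity

print(w)  # ends up very close to 0, the minimum
```

With the zero initialisation, the first step is just a plain gradient step and the iterate settles at the minimum.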

When I used np.zeros to initialise my velocities, I was able to get up to 80% accuracy (on a particular dataset). But when I initialised with np.ones, as in the code above, I could not even reach 20% accuracy. I've been using ones, but I can't figure out why zeros works and ones doesn't. And there's also the random method from numpy.

What's the recommended approach to initialising the weight_velocities? Note that I intentionally excluded the bias units and the learning rate, and that I'm importing numpy as np.

elchroy