
Here is part of the `get_updates` code from the SGD optimizer in Keras (source):

moments = [K.zeros(shape) for shape in shapes]
self.weights = [self.iterations] + moments
for p, g, m in zip(params, grads, moments):
    v = self.momentum * m - lr * g  # velocity
    self.updates.append(K.update(m, v))

Observation:

Since `moments` is a list of zero tensors, each `m` in the for loop is a zero tensor with the shape of `p`. So `self.momentum * m`, in the first line of the loop, is just a scalar multiplied by a zero tensor, which results in a zero tensor.

Question

What am I missing here?

Sunderam Dubey
oak

1 Answer


Yes - during the first training iteration `m` is equal to 0. But it is then updated with the current value of `v` by this line:

self.updates.append(K.update(m, v))

So in the next iteration you'll have:

v = self.momentum * old_velocity - lr * g  # velocity

where `old_velocity` is the previous value of `v`.
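To make this concrete, here is a minimal plain-Python sketch of the same recurrence for a single scalar parameter. It is not Keras code: the function name `run_momentum_sgd`, the default `lr` and `momentum` values, and the constant gradients are assumptions chosen for illustration only.

```python
# Plain-Python stand-in for the Keras update; not the Keras API.
def run_momentum_sgd(grads, lr=0.1, momentum=0.9):
    """Return the velocity after each training step for one scalar parameter."""
    m = 0.0  # persistent state, initialised once (like K.zeros(shape))
    history = []
    for g in grads:  # each pass models one training step
        v = momentum * m - lr * g  # same formula as in get_updates
        m = v  # plays the role of K.update(m, v)
        history.append(v)
    return history


# Constant gradient of 1.0 for three steps (illustrative values).
velocities = run_momentum_sgd([1.0, 1.0, 1.0])
print(velocities)  # roughly -0.1, -0.19, -0.271
```

Only the first step sees `m == 0`; from the second step on, the `momentum * m` term is non-zero because `m` holds the previous velocity.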

Marcin Możejko
  • Thanks for the answer. When the `Keras` training function runs, it calls each compiled symbolic function from the `self.updates` list. Since each element is symbolic, the old values of the calculation are kept between calls, i.e. the `moments` list is declared only once and `K.update(m, v)` contains the full symbolic information about `v`. Did I get it right? – oak Jul 04 '17 at 12:05
  • I think - yes - you've got it. – Marcin Możejko Jul 04 '17 at 12:08
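The build-once, run-many idea from the comments can be sketched roughly as follows. This is only an analogy in plain Python, assuming closures as a stand-in for symbolic update ops; the `Var` class, `build_updates`, and the constant gradient are hypothetical and are not part of Keras.

```python
class Var:
    """Hypothetical mutable variable, standing in for a backend variable."""

    def __init__(self, value):
        self.value = value


def build_updates(m, grad_fn, lr=0.1, momentum=0.9):
    """Build the update list once; nothing is computed at this point."""
    def update_m():
        # Deferred analogue of K.update(m, v): m.value is read when called.
        m.value = momentum * m.value - lr * grad_fn()

    return [update_m]


m = Var(0.0)  # created once, like the moments list
updates = build_updates(m, grad_fn=lambda: 1.0)  # constant gradient, for illustration

trace = []
for _ in range(3):  # the training function re-runs the same update list
    for apply_update in updates:
        apply_update()
    trace.append(m.value)
print(trace)  # roughly -0.1, -0.19, -0.271
```

The update list is constructed a single time, while the state in `m` changes on every run, which is why the zero initial value only matters for the first step.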