
What I'm coding: I'm building a simple neural network with a weight matrix w and a second parameter u for the score. After multiplying my input vector by w, the result is multiplied by a vector u to get a single number, and that is my score.
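For reference, here is roughly what my forward pass looks like in Java (a simplified sketch with illustrative names, computing score = u · (w · x) with no non-linearity):

```java
public final class TinyNet {

    private final double[][] w; // weight matrix, shape [hidden][input]
    private final double[] u;   // score vector, length [hidden]

    public TinyNet(double[][] w, double[] u) {
        this.w = w;
        this.u = u;
    }

    /** Multiply the input by w, then take the dot product with u to get one score. */
    public double score(double[] x) {
        double[] hidden = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            double sum = 0.0;
            for (int j = 0; j < x.length; j++) {
                sum += w[i][j] * x[j];
            }
            hidden[i] = sum; // a non-linearity could be applied here
        }
        double score = 0.0;
        for (int i = 0; i < u.length; i++) {
            score += u[i] * hidden[i];
        }
        return score;
    }
}
```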

Where I am right now: I have calculated the gradients of both parameters with respect to my loss function.

My problem: Now I'm stuck on what to do next.

My solution proposal: Can I update the parameters with w = w + learningrate * w_grad (and analogously for u with u_grad), and repeat this procedure until my cost/loss value decreases? Does this work? Is it correct? Is this a simple implementation of Stochastic Gradient Descent?

I'm coding in Java; if you have a simple and well-documented example of how to optimize a neural net, please share it with me.

Thanks in advance!

user3352632

1 Answer


I suppose that w_grad holds the partial derivatives. What you describe in your solution proposal is an iterative optimization method. Just one correction: instead of w = w + learingrate * w_grad you should use w = w - learingrate * w_grad, because you want to step against the gradient to decrease the loss. This works fine, but on a multicore machine it will only use one core. If you need a performance boost, you can try a batch algorithm: w = w - learingrate * Sum(w_grad). The performance boost comes from the w_grad calculations, which can be done in parallel across the batch.
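Here is a minimal Java sketch of both variants, assuming w_grad and u_grad are plain arrays of the partial derivatives (the names are illustrative, not your actual code):

```java
import java.util.List;

public final class GradientDescentSketch {

    /** One plain gradient-descent step: move AGAINST the gradient. */
    public static void sgdStep(double[][] w, double[][] wGrad,
                               double[] u, double[] uGrad,
                               double learningRate) {
        for (int i = 0; i < w.length; i++) {
            for (int j = 0; j < w[i].length; j++) {
                w[i][j] -= learningRate * wGrad[i][j];   // w = w - learingrate * w_grad
            }
        }
        for (int i = 0; i < u.length; i++) {
            u[i] -= learningRate * uGrad[i];             // u = u - learingrate * u_grad
        }
    }

    /** Batch variant: sum the per-example gradients, then do one update
     *  (w = w - learingrate * Sum(w_grad)). */
    public static void batchStep(double[][] w, List<double[][]> perExampleWGrads,
                                 double learningRate) {
        int rows = w.length;
        int cols = w[0].length;
        double[][] sum = new double[rows][cols];
        // Computing the per-example gradients that feed this sum is the part
        // worth spreading across cores; the summation and update are cheap.
        for (double[][] g : perExampleWGrads) {
            for (int i = 0; i < rows; i++) {
                for (int j = 0; j < cols; j++) {
                    sum[i][j] += g[i][j];
                }
            }
        }
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                w[i][j] -= learningRate * sum[i][j];
            }
        }
    }
}
```

In the batch variant the speed-up comes from computing the per-example gradients in parallel (for example with an ExecutorService or parallel streams) before summing them; the update itself is cheap.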

Yuriy Zaletskyy