I'm trying to implement mini-batching correctly for my own NN.
But I can't wrap my head around what exactly gets summed. Do I sum the gradients, or the delta weights (where the learning rate is already multiplied in)? The delta weight and delta bias in my example are:
Delta Weight: (activation'(neurons) ⊗ Error) * learningRate × input
Delta Bias: (activation'(neurons) ⊗ Error) * learningRate
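
To make that concrete, here is a rough sketch of how I compute those per-example deltas for a single layer (NumPy, sigmoid activation assumed only for illustration; `per_example_deltas` is just my own helper name):

```python
import numpy as np

# Rough sketch of my per-example backward pass for one layer,
# with a sigmoid activation assumed just for illustration.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def per_example_deltas(z, error, x, learning_rate):
    """z: pre-activations of the layer, error: backpropagated error for the layer,
    x: input vector to the layer. Returns (delta_weight, delta_bias)."""
    grad_bias = sigmoid_prime(z) * error      # activation'(neurons) ⊗ Error
    delta_bias = learning_rate * grad_bias    # ... * learningRate
    delta_weight = np.outer(delta_bias, x)    # ... × input (outer product)
    return delta_weight, delta_bias
```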
Do I also divide those summed delta weights or gradients by the batch size?
EDIT:
So, all my questions summed up:
- Is the delta weight without the learning rate called the gradient?
- Do I need to add up those delta weights with or without the learning rate multiplied in? (That's the difference between the two variants sketched below.)
- So I must keep two separate gradients, one for the weights and one for the biases?
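
And here are the two mini-batch update variants I can't decide between, as a rough sketch building on the helper above (dummy data just to make it run; the sign of the update depends on how Error is defined):

```python
rng = np.random.default_rng(0)
n_in, n_out, batch_size = 4, 3, 8
learning_rate = 0.1
weights = rng.normal(size=(n_out, n_in))
biases = np.zeros(n_out)

# dummy batch of (input, pre-activation, backpropagated error) triples
batch = [(rng.normal(size=n_in), rng.normal(size=n_out), rng.normal(size=n_out))
         for _ in range(batch_size)]

# Variant A: accumulate the delta weights with the learning rate already multiplied in
delta_w_sum = np.zeros_like(weights)
delta_b_sum = np.zeros_like(biases)
for x, z, error in batch:
    dw, db = per_example_deltas(z, error, x, learning_rate)
    delta_w_sum += dw
    delta_b_sum += db
weights += delta_w_sum / len(batch)   # <- divide by the batch size here, or not?
biases += delta_b_sum / len(batch)

# Variant B: accumulate the raw gradients, average, then apply the learning rate once
grad_w_sum = np.zeros_like(weights)
grad_b_sum = np.zeros_like(biases)
for x, z, error in batch:
    gb = sigmoid_prime(z) * error     # gradient for the bias
    gw = np.outer(gb, x)              # gradient for the weights
    grad_w_sum += gw
    grad_b_sum += gb
weights += learning_rate * grad_w_sum / len(batch)
biases += learning_rate * grad_b_sum / len(batch)
```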