Here is my understanding of how dropout regularization works in a DNN:
Dropout:
First, we randomly delete hidden neurons from the DNN, leaving the input and output layers unchanged. We then perform forward propagation and backpropagation on a mini-batch, compute the gradients for that mini-batch, and update the weights and biases. I denote these updated weights and biases as Updated_Set_1.
Then we restore the DNN to its original state and randomly delete a different set of neurons. We again perform forward and backward propagation and obtain a new set of weights and biases, Updated_Set_2. This process continues up to Updated_Set_N, where N is the number of mini-batches.
Lastly, we average all the weights and biases over Updated_Set_1 through Updated_Set_N. These averaged weights and biases are then used to make predictions on new inputs. (I sketch this whole procedure in code below.)
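To make the question concrete, here is a minimal numpy sketch of the procedure exactly as I have described it above. It is not meant to be a reference implementation of dropout; the layer sizes, keep probability, learning rate, and the train_on_minibatch helper are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_on_minibatch(params, X, y, keep_prob=0.5, lr=0.1):
    """One forward/backward pass with a fresh random mask that 'deletes' hidden neurons."""
    W1, b1, W2, b2 = params
    mask = (rng.random(b1.shape) < keep_prob).astype(float)  # 0 = deleted neuron for this batch
    h = np.maximum(0, X @ W1 + b1) * mask                    # ReLU hidden layer with dropout
    y_hat = h @ W2 + b2
    # Backpropagation of a squared-error loss; deleted units have h = 0, so they get no gradient.
    g = 2 * (y_hat - y) / len(X)
    gh = (g @ W2.T) * (h > 0)
    return [W1 - lr * X.T @ gh, b1 - lr * gh.sum(axis=0),
            W2 - lr * h.T @ g,  b2 - lr * g.sum(axis=0)]

# Toy data and a tiny 4-8-1 network (all sizes are arbitrary).
X_all, y_all = rng.normal(size=(64, 4)), rng.normal(size=(64, 1))
params = [rng.normal(size=(4, 8)) * 0.1, np.zeros(8),
          rng.normal(size=(8, 1)) * 0.1, np.zeros(1)]

updated_sets = []  # Updated_Set_1 ... Updated_Set_N
for X_b, y_b in zip(np.split(X_all, 4), np.split(y_all, 4)):
    params = train_on_minibatch(params, X_b, y_b)   # new random neurons deleted each time
    updated_sets.append([p.copy() for p in params])

# The step I am unsure about: average Updated_Set_1 ... Updated_Set_N
# and use the averaged weights/biases (with no dropout) for prediction.
avg = [np.mean([s[i] for s in updated_sets], axis=0) for i in range(4)]
W1, b1, W2, b2 = avg
prediction = np.maximum(0, X_all @ W1 + b1) @ W2 + b2
```

In particular, the final averaging step is the part I am least sure about.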
I just want to confirm whether my understanding is correct. If it is wrong, please share your thoughts and correct me. Thank you in advance.