I want to use the SGD optimizer in tf.keras, but the SGD documentation describes it as:
Gradient descent (with momentum) optimizer.
Does that mean SGD doesn't support the "randomly shuffle examples in the dataset" phase?
I checked the SGD source, and it seems there is no random-shuffle method.
My understanding of SGD is that it applies gradient descent to randomly sampled examples, but this optimizer only performs gradient descent with momentum and Nesterov.
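To make my question concrete, here is a plain-Python sketch (no TensorFlow, scalar weight only) of the update rule I believe tf.keras SGD applies, based on the momentum/Nesterov description in its docs; the shuffling of examples is nowhere in it:

```python
def sgd_step(w, grad, velocity, lr=0.01, momentum=0.9, nesterov=False):
    """One SGD update on a single scalar weight, following the
    momentum/Nesterov formulation described in the Keras docs.
    Note: no data shuffling happens here -- the optimizer only
    consumes whatever gradient it is handed."""
    velocity = momentum * velocity - lr * grad
    if nesterov:
        w = w + momentum * velocity - lr * grad
    else:
        w = w + velocity
    return w, velocity

# Toy run: minimize f(w) = w**2 (gradient is 2*w) starting from w = 1.0.
w, v = 1.0, 0.0
for _ in range(100):
    w, v = sgd_step(w, 2 * w, v)
print(abs(w) < 0.1)  # w moves toward the minimum at 0
```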
Does the batch size I defined in my code correspond to SGD's random-shuffle phase?
If so, it shuffles the examples randomly but never reuses the same example within an epoch, doesn't it?
Is my understanding correct?
I wrote the batching code as below.
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(10000).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)
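To state what I think shuffle(...).batch(32) does, here is a plain-Python sketch (no TensorFlow; shuffled_batches is my own hypothetical helper, not a tf.data API): each epoch the examples are reordered randomly and then cut into batches, so every example is seen exactly once per epoch. If this is right, the "random shuffle" lives in the input pipeline, not in the SGD optimizer.

```python
import random

def shuffled_batches(examples, batch_size, seed=None):
    """Sketch of one epoch of Dataset.shuffle(...).batch(batch_size):
    reorder all examples (sampling WITHOUT replacement), then split
    the reordered list into consecutive batches."""
    rng = random.Random(seed)
    order = list(examples)
    rng.shuffle(order)  # random order; no example is drawn twice
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

data = list(range(10))
batches = shuffled_batches(data, batch_size=3, seed=0)
flat = [x for batch in batches for x in batch]
print(sorted(flat) == data)  # every example appears exactly once -> True
```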