Does `tf.distribute.MirroredStrategy` have an impact on training outcome?

Question

I don't understand whether `MirroredStrategy` has any impact on the training outcome.

By that, I mean: Is the model trained on a single device the same as a model trained on multiple devices?

I think it should be the same model, because it's just a distributed calculation of the gradients, isn't it?

score 1 · Accepted Answer · answered Apr 06 '20 at 15:41

Yes, a model trained on a single GPU and one trained on multiple GPUs (on a single machine) are the same. That is, the variables in the model are replicated and kept in sync across all GPUs, as per the documentation.
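To make the "distributed calculation of the gradients" concrete, here is a minimal pure-Python sketch (not TensorFlow code, and not part of the original answer). It assumes a toy model `y = w * x` with a mean-squared-error loss, splits one global batch across a hypothetical number of replicas, scales each replica's gradient by the global batch size, and sums the results, which mirrors the sum-style all-reduce used in synchronous data-parallel training. The function names and data are illustrative only.

```python
import math

def grad_sum(w, xs, ys):
    """Sum over examples of d/dw (w * x - y) ** 2 for the toy model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))

def single_device_grad(w, xs, ys):
    """Gradient of the mean loss computed over the whole batch on one device."""
    return grad_sum(w, xs, ys) / len(xs)

def mirrored_grad(w, xs, ys, num_replicas):
    """Gradient obtained by sharding the batch and summing per-replica results.

    Each replica divides by the GLOBAL batch size, so the sum over replicas
    equals the gradient of the mean loss over the whole batch.
    """
    global_batch = len(xs)
    # Shard the batch: replica i takes every num_replicas-th example.
    shards = [(xs[i::num_replicas], ys[i::num_replicas]) for i in range(num_replicas)]
    per_replica = [grad_sum(w, sx, sy) / global_batch for sx, sy in shards]
    # Stand-in for the all-reduce (sum) step across replicas.
    return sum(per_replica)

# Illustrative data, chosen arbitrarily.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
w = 0.5

g_single = single_device_grad(w, xs, ys)
g_mirrored = mirrored_grad(w, xs, ys, num_replicas=2)
print(math.isclose(g_single, g_mirrored, rel_tol=1e-12))
```

Because every replica then applies the same summed gradient to its own copy of the variables, the copies stay identical after each step, provided the global batch size is the same as in the single-device run.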

Alexander Ejbekov


Thank you! And am I also right in assuming that with asynchronous training, the resulting model differs from one trained in a non-parallel way? – Domi W Apr 06 '20 at 15:45

Depending on what you are training, results may vary ever so slightly, but that has more to do with entropy (i.e. randomness) than anything else. In most cases you should achieve the same or near-identical results. – Alexander Ejbekov Apr 06 '20 at 18:17
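One possible source of such tiny differences, offered here as an assumption rather than something stated in the thread, is that floating-point addition is not associative: summing the same per-replica gradient contributions in a different order can change the last bits of the result. A minimal sketch with made-up values:

```python
# Three hypothetical gradient contributions (values chosen for illustration).
grads = [0.1, 0.2, 0.3]

# The same numbers summed in two different orders.
left_to_right = (grads[0] + grads[1]) + grads[2]
right_to_left = grads[0] + (grads[1] + grads[2])

# The two sums differ only by a rounding error on the order of 1e-16.
difference = abs(left_to_right - right_to_left)
print(left_to_right == right_to_left, difference)
```

Differences of this size are usually negligible for a single step, though they can accumulate over many training steps, alongside other sources of randomness such as weight initialization and data shuffling.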
