
I don't understand whether MirroredStrategy has any impact on the training outcome.

By that, I mean: Is the model trained on a single device the same as a model trained on multiple devices?

I think it should be the same model, because it's just a distributed calculation of the gradients, isn't it?

Domi W

1 Answer


Yes, the model trained on a single GPU and the model trained on multiple GPUs (on a single machine) are the same. That is, the variables in the model are replicated and kept in sync across all GPUs, as per the documentation.
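For context, here is a minimal sketch of how MirroredStrategy is typically used with Keras; the layer sizes and the random data are placeholders, not anything from the question:

```python
import numpy as np
import tensorflow as tf

# Variables created inside the strategy scope are mirrored: one copy per
# GPU, all kept in sync by applying the same reduced gradient update.
strategy = tf.distribute.MirroredStrategy()
print("Number of replicas:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# Placeholder data just to make the sketch runnable. Each global batch is
# split across the replicas, per-replica gradients are reduced (summed),
# and the identical update is applied to every copy of the variables.
x = np.random.rand(256, 10).astype("float32")
y = np.random.rand(256, 1).astype("float32")
model.fit(x, y, batch_size=64, epochs=1)
```

Because the updates are synchronous, every replica applies the same reduced gradient, which is why training should match single-device training with the same global batch size (up to the usual floating-point nondeterminism).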

Alexander Ejbekov
  • Thank you! And am I also right in assuming that with asynchronous training, the resulting model differs from a model trained in a non-parallel way? – Domi W Apr 06 '20 at 15:45
  • Depending on what you are training, results may vary ever so slightly, but that has more to do with nondeterminism than anything else. In most cases you should get the same or nearly identical results. – Alexander Ejbekov Apr 06 '20 at 18:17