
I'm trying to understand the differences between in-graph replication and between-graph replication as described in distributed TensorFlow, especially how data is synced between multiple devices.

My understanding is that in in-graph replication, each worker does NOT keep a local replica of the model, which seems to be the main difference from between-graph replication.

Therefore, in in-graph replication, each input tensor to each operation is a connection to wherever the Variable is stored (probably on a different parameter server machine), whereas in between-graph replication, data is pulled in batch to sync all the parameters.
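To make the question concrete, here is a toy sketch of the two read patterns I have in mind. This is plain Python, not the TensorFlow API; `ParameterServer`, `in_graph_step`, and `BetweenGraphWorker` are made-up names for illustration only:

```python
# Toy model (NOT TensorFlow) of the two read patterns in question.
# A "PS" holds the variables; a worker either reads them per-op
# (in-graph style) or keeps a local replica synced in batch
# (between-graph style).

class ParameterServer:
    def __init__(self):
        self.vars = {"w": 1.0}

    def read(self, name):
        # Every read returns the current value held on the PS.
        return self.vars[name]

    def apply_update(self, name, delta):
        self.vars[name] += delta


def in_graph_step(ps, x):
    # In-graph style: the op's input is fetched from the PS at
    # execution time, so it always sees the freshest value.
    w = ps.read("w")
    return w * x


class BetweenGraphWorker:
    # Between-graph style: the worker holds a local replica of the
    # parameters and refreshes it in batch, not per-op.
    def __init__(self, ps):
        self.ps = ps
        self.local = {}

    def sync(self):
        # Batch pull of ALL parameters at once.
        self.local = dict(self.ps.vars)

    def step(self, x):
        # Computation uses the local replica, which may be stale.
        return self.local["w"] * x


ps = ParameterServer()
print(in_graph_step(ps, 3.0))   # reads w=1.0 from the PS -> 3.0

worker = BetweenGraphWorker(ps)
worker.sync()                   # local copy of w=1.0
ps.apply_update("w", 1.0)       # PS now holds w=2.0
print(worker.step(3.0))         # still 3.0: local replica is stale
print(in_graph_step(ps, 3.0))   # 6.0: per-op read sees the fresh value
```

The staleness difference at the end is exactly what I am unsure about: whether the per-op reads in in-graph replication really go over the wire to the PS each time, and whether that happens synchronously.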

Am I right in this interpretation?
Does this mean that, in in-graph replication, the data for each operation is pulled from the PS?
Is this pull synchronous or asynchronous?

MBZ
  • We have a similar question, so I asked in the Google Group; check it out: https://groups.google.com/forum/#!topic/tensorflow/yQgXrm7QqDM – tobe Jun 26 '16 at 14:16
  • Personally, I found this [post](http://stackoverflow.com/questions/41600321/distributed-tensorflow-the-difference-between-in-graph-replication-and-between) really helpful! – Jor Feb 25 '17 at 07:25

0 Answers