I'm trying to understand the differences between In-graph replication and Between-graph replication as described in distributed TensorFlow, especially how data is synced between multiple devices.
My understanding is that in in-graph replication, each worker does NOT keep a local replica of the model, which seems to be the main difference from between-graph replication.
Therefore, for in-graph replication, each input tensor to each operation will be a connection to wherever the Variable is stored (probably on a different parameter server (PS) machine), whereas in between-graph replication, data is pulled in batches to sync all the parameters. The sketch below shows how I picture the two patterns.
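To make the question concrete, here is a minimal sketch of how I understand the two patterns would be expressed with the TF 1.x API. The cluster addresses, tensor shapes, and variable names are made up for illustration; I'm only building the graphs, not running them against a real cluster:

```python
import tensorflow as tf  # TF 1.x API, where these replication terms are defined

# Hypothetical cluster; the host names and ports are assumptions.
cluster = tf.train.ClusterSpec({
    "ps":     ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
})

# --- In-graph replication: ONE client builds ONE graph spanning all devices.
with tf.Graph().as_default():
    with tf.device("/job:ps/task:0"):
        # Single shared copy of the parameters, pinned to the PS.
        w = tf.get_variable("w", shape=[784, 10])

    losses = []
    for i in range(2):
        with tf.device("/job:worker/task:%d" % i):
            # Each worker's compute ops live in the SAME graph; reading `w`
            # here becomes a recv edge that fetches its value from the PS.
            x = tf.placeholder(tf.float32, shape=[None, 784], name="x%d" % i)
            losses.append(tf.reduce_sum(tf.matmul(x, w)))

    total_loss = tf.add_n(losses)  # one driver session would run all of this

# --- Between-graph replication: EACH worker process runs code like this and
# builds its OWN copy of the graph; replica_device_setter still pins the
# variables to the PS, so all replicas share the same parameters.
task_index = 0  # in reality this differs per worker process
with tf.Graph().as_default():
    with tf.device(tf.train.replica_device_setter(
            worker_device="/job:worker/task:%d" % task_index,
            cluster=cluster)):
        w = tf.get_variable("w", shape=[784, 10])   # placed on the PS
        x = tf.placeholder(tf.float32, shape=[None, 784])
        loss = tf.reduce_sum(tf.matmul(x, w))       # placed on this worker
```

In both sketches the variable read seems to cross a machine boundary, which is what my questions below are about.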
Is this interpretation correct?
Does this mean that, for each operation in in-graph replication, the data is pulled from the PS?
Is this pulling synchronous or asynchronous?