
It's very awesome to see that TensorFlow Federated can now support distributed training. I referred to the example here. However, it seems the training data are sent from the server to the clients at each epoch, and the client (remote_executor_service) doesn't hold any dataset itself. That is different from a typical federated learning scenario, so I was wondering whether I could place the training data separately on each client?

1 Answer


Two things to note:

  • The example linked (High-performance Simulation with Kubernetes) is running a federated learning simulation. At this time TensorFlow Federated is primarily used for executing research simulations; there is not yet a way to deploy to real smartphones. In the simulation, each client dataset is logically separate, but may be physically present on the same machine (illustrated in the first sketch after this list).

  • The creation of a tf.data.Dataset (e.g. the definition of train_data in the tutorial) can be thought of as creating a "recipe to read data" rather than actually reading the data itself. For example, adding .batch() or .map() calls to a dataset returns a new recipe, but doesn't actually materialize any dataset examples. The dataset isn't actually read until a .reduce() call, or until the dataset is iterated over in a for loop. In the tutorial, the dataset "recipe" is sent to the remote workers; the data is read/materialized remotely when the dataset is iterated over during local training (the data itself is not sent to the remote workers). See the second sketch after this list.
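A rough sketch of the first point, using the EMNIST simulation dataset bundled with TFF (the dataset choice here is just illustrative, not taken from the linked Kubernetes example): every client id maps to its own logically separate dataset, even though all the data sits on the one machine running the simulation.

```python
import tensorflow_federated as tff

# Loads the federated EMNIST simulation dataset onto the local machine.
emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data()

# Each client id addresses a logically separate tf.data.Dataset, but the
# underlying files are all physically present on this same host.
first_client = emnist_train.client_ids[0]
client_dataset = emnist_train.create_tf_dataset_for_client(first_client)
print(first_client, client_dataset.element_spec)
```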

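And a minimal sketch of the second point, the "recipe" behavior of tf.data; the numeric pipeline below is made up purely for illustration:

```python
import tensorflow as tf

def make_recipe():
    # Each call returns a new dataset object describing *how* to produce
    # elements; nothing is read or computed yet.
    ds = tf.data.Dataset.range(10)
    ds = ds.map(lambda x: x * 2)   # still just a recipe
    ds = ds.batch(4)               # still just a recipe
    return ds

recipe = make_recipe()  # cheap: no examples have been materialized

# Only now, when the dataset is iterated (as happens during local training
# on a remote worker), are the elements actually produced.
for batch in recipe:
    print(batch.numpy())
```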
Zachary Garrett