I'm trying to use the tf-slim library to train Inception V3 using multiple machines which do not have GPUs.

I'm following the tutorial given here: https://github.com/tensorflow/models/tree/master/slim. I can train on a single CPU-only machine, but I'm trying to figure out how to train across multiple machines. The page mentions:

This process may take several days, depending on your hardware setup. For convenience, we provide a way to train a model on multiple GPUs, and/or multiple CPUs, either synchronously or asynchronously. See model_deploy for details.

I can't figure out how to use model_deploy to train across multiple machines, nor can I find any examples of doing so.

How do I go about using model_deploy to train using multiple machines?

Thanks in advance!

Darshan

0 Answers