I am trying to train a CNN (ResNet50 for now) using Keras on Google Colab with their TPU support. The Colab VM has only a small local disk, so I cannot fit my training images on it.
I tried uploading the train/test images to Google Drive, but accessing the files from there in Colab appears to be rather slow. So I set up a Google Cloud Storage (GCS) bucket and uploaded the data there, but I cannot find good examples of how to connect the bucket to Keras and the TPU for training.
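So far I have only gotten as far as authenticating the Colab session so TensorFlow can reach the bucket (the bucket name below is just a placeholder for mine):

```python
# Authenticate this Colab session so TensorFlow can access GCS
from google.colab import auth
auth.authenticate_user()

BUCKET = "gs://my-tpu-data"  # placeholder for my actual bucket
```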
The TensorFlow website suggests simply using GCS as a filesystem, but mentions that file access has to go through `tf.io.gfile`. What does that mean with regard to Keras?
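From what I can tell, `tf.io.gfile` at least lets me list and read files under a `gs://` path:

```python
import tensorflow as tf

# tf.io.gfile understands gs:// paths directly
# ("my-tpu-data" is a placeholder bucket name)
train_files = tf.io.gfile.glob("gs://my-tpu-data/train/*/*.jpg")
print(len(train_files), train_files[:3])
```

But I do not understand whether a Keras generator or `model.fit` can consume such paths directly.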
The Shakespeare TPU example shows mounting a GCS bucket and using it for model storage, so I can mount and reference the bucket that way. But it does not show how to feed the training data from GCS. All the examples I can find use some predefined dataset bundled with Keras.
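Based on the tf.data image-loading tutorial, I think the input pipeline is supposed to look something like the sketch below; the bucket name, directory layout, and class names are assumptions about my own data:

```python
import tensorflow as tf

IMG_SIZE = 224
BATCH_SIZE = 128
CLASS_NAMES = tf.constant(["class_a", "class_b"])  # placeholder class list

def parse_image(path):
    # Read the raw bytes straight from GCS, decode, and resize
    image = tf.io.read_file(path)
    image = tf.io.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
    # Assuming the class name is the parent directory of each file
    parts = tf.strings.split(path, "/")
    label = tf.argmax(parts[-2] == CLASS_NAMES)
    return image, label

paths = tf.data.Dataset.list_files("gs://my-tpu-data/train/*/*.jpg")
dataset = (paths
           .map(parse_image, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(BATCH_SIZE, drop_remainder=True)  # TPUs need fixed batch shapes
           .prefetch(tf.data.AUTOTUNE))
```

Is this the intended replacement for the Keras generators in the TPU case?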
Some instructions state that the TPU runs on its own separate server, and that the data should be on GCS for the TPU to access it. If I run a Keras generator, do image augmentation, and then feed the results to the training loop, does this not mean I am continuously downloading images over the network to the Colab VM, modifying them there, and sending them over the network again to the TPU server?
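If everything instead goes through tf.data and a TPU strategy (with augmentation presumably added as another `map` on the dataset), is the sketch below the intended pattern? It continues from the `dataset` above; the class count and epoch count are made up:

```python
import tensorflow as tf

# Connect to the Colab TPU and create a distribution strategy
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

# Build and compile the model inside the strategy scope
with strategy.scope():
    model = tf.keras.applications.ResNet50(weights=None, classes=2)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# My hope is that fit() streams batches from GCS to the TPU
# without round-tripping every image through the Colab VM
model.fit(dataset, epochs=5)
```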
It all seems rather complicated for running a simple CNN in Keras on a TPU. What am I missing here; what is the correct process?
A concrete example from anyone would be great.