
I have a Keras model, but it's too big for my local PC, so I'm trying to migrate to Google Cloud to be able to use a TPU.

The examples I have seen use in-memory images to train the model with the fit function.

I have thousands of images, and I also want to use image augmentation. In my local model I use ImageDataGenerator and fit_generator.

How do I do this using the TPU?

I have several ideas:

  1. Mount a bucket in the virtual machine.
  2. Copy the images to the virtual machine's disk and use ImageDataGenerator as I do on my local machine.

But I'm not sure, and I feel that all of these methods are inefficient.

Is there a way to do it efficiently?
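For context, fit_generator just consumes a Python generator that yields (batch_x, batch_y) tuples forever; a minimal sketch of the kind of augmenting generator I mean, with a simple random horizontal flip (NumPy only, shapes and augmentation are placeholders):

```python
import numpy as np

def augmenting_generator(images, labels, batch_size=32):
    """Yield (batch_x, batch_y) tuples forever, with a random horizontal flip."""
    n = len(images)
    while True:
        idx = np.random.randint(0, n, size=batch_size)
        batch_x = images[idx].copy()
        # flip roughly half the samples along the width axis
        flip = np.random.rand(batch_size) < 0.5
        batch_x[flip] = batch_x[flip, :, ::-1, :]
        yield batch_x, labels[idx]
```

ImageDataGenerator does essentially this (plus more augmentations) behind the scenes.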

michaelb
Mquinteiro
  • Is the question how to implement ImageDataGenerator & fit_generator? Or is your question more so “How can I mount a large amount of images in a Google Cloud Platform VM”? What do you mean by “How do I do this using the TPU?” – Milad Tabrizi Sep 11 '18 at 18:13
  • @Milad, the question is whether I can use fit_generator in the same way as fit, and whether it is a good idea to mount a large number of images using a FUSE filesystem; I have the feeling that it must be very slow. Or whether there is another, better way to do it. – Mquinteiro Sep 12 '18 at 08:09

2 Answers


If you're looking for read speed, GCP does offer local SSDs, which would be the fastest way for your machine to read the images. Local SSDs have a 3 TB limit, so you may have to attach several to your VM depending on the number of images.

If you're looking to reduce costs, mounting a bucket with a FUSE filesystem is the way to go, but it would be the slowest option, since the potential distance from the source is the greatest.

Google has a great article that explains the different storage options you have. The article also has tables that lay out the costs and speeds, along with other technical details about what each option offers.
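If you go the FUSE route, the usual tool is Cloud Storage FUSE (gcsfuse); a rough sketch of mounting a bucket read-only (the bucket name and mount point are placeholders):

```shell
# mount the bucket read-only at /mnt/images (bucket name is a placeholder)
mkdir -p /mnt/images
gcsfuse --implicit-dirs -o ro my-image-bucket /mnt/images

# unmount when done
fusermount -u /mnt/images
```

Note that every image read becomes a GCS object request, which is why this tends to be the slowest option for training.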

xavierc

TensorFlow recently announced support for Keras on Cloud TPUs (as of 1.11), so your existing model with fit_generator should work; here is an example using fit_generator on a TPU.

For the performance part of your question: once you have the model running on the TPU, you can use the TPU profiler to determine whether storage is a bottleneck. If it is, there are a number of ways around it, mostly by optimizing the input pipeline.
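A common way to optimize the input pipeline is to replace the Python generator with tf.data, which decodes and augments images in parallel and overlaps I/O with compute. A sketch against the TF 2.x API (the file pattern, image size, and augmentation are placeholders; the same pattern works with a gs:// file pattern):

```python
import tensorflow as tf

def make_dataset(file_pattern, batch_size=128, image_size=(224, 224)):
    """Build an input pipeline that decodes and augments images in parallel."""
    def load_and_augment(path):
        image = tf.io.read_file(path)
        image = tf.io.decode_jpeg(image, channels=3)
        image = tf.image.resize(image, image_size)
        image = tf.image.random_flip_left_right(image)  # simple augmentation
        return image / 255.0

    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    return (files
            .map(load_and_augment, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(batch_size, drop_remainder=True)  # TPUs need static shapes
            .prefetch(tf.data.AUTOTUNE))             # overlap I/O with compute
```

With TFRecord files instead of individual JPEGs you would additionally interleave reads across files, but the map/batch/prefetch structure is the same.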

Wyck
michaelb