I am using Google Colab Pro with the TPU it provides. I need to load a pre-trained model onto the TPU.

  • The TPU can load data only from a Google Cloud Storage bucket.
  • I created a Cloud Storage bucket and extracted the pre-trained model files into it.

Now I need to give permission to the TPU to access my private bucket, but I don't know the service account of the TPU. How do I find it?

For now I have simply granted All:R read permission on the bucket, and the TPU initialized successfully, but this is clearly not the optimal solution.

fabrizioM

2 Answers

5

I've been struggling with this scenario myself (although with the free version of Colab) and just got it to work. This specific use case doesn't appear to be very well documented; the official documentation seems to deal mostly with cases involving a Compute Engine VM rather than an auto-assigned TPU. The process that worked for me went as follows:

  1. Run Google Cloud SDK authentication and set the project (these two steps may be redundant; I haven't yet tried doing just one or the other):

         !gcloud auth login
         !gcloud config set project [Project ID of Storage Bucket]

     and

         from google.colab import auth
         auth.authenticate_user()

  2. Initialize the TPU (from the TensorFlow TPU docs):

         import os
         import tensorflow as tf

         resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
         tf.config.experimental_connect_to_cluster(resolver)
         tf.tpu.experimental.initialize_tpu_system(resolver)
         # On TF >= 2.3, tf.distribute.TPUStrategy is the non-experimental name.
         strategy = tf.distribute.experimental.TPUStrategy(resolver)

  3. Try to load the model:

         model = tf.keras.models.load_model('gs://[Bucket name and path to saved model]')

This initially failed, but the error message included the service account of the TPU that was trying to access the directory, and that is the address I granted access to, as described in the Cloud Storage docs. The address is in the service-[PROJECT_NUMBER]@cloud-tpu.iam.gserviceaccount.com format, but the project number in it isn't the Project ID of the project my bucket is in, nor a value I've been able to find anywhere else.
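Since the address only shows up inside the error text, a small helper can pull it out. This is a minimal sketch that assumes only the address format quoted above; the function name `find_tpu_service_account` and the sample message are hypothetical, and the real TensorFlow error is worded differently.

```python
import re

# Matches the Cloud TPU service account format quoted in the answer.
TPU_SERVICE_ACCOUNT = re.compile(
    r"service-\d+@cloud-tpu\.iam\.gserviceaccount\.com"
)


def find_tpu_service_account(error_text):
    """Return the first TPU service account address in error_text, or None."""
    match = TPU_SERVICE_ACCOUNT.search(error_text)
    return match.group(0) if match else None


# Hypothetical error text, used only to exercise the helper.
sample = (
    "Permission denied: "
    "service-123456789012@cloud-tpu.iam.gserviceaccount.com "
    "does not have access to the bucket."
)
print(find_tpu_service_account(sample))
```

In a notebook, one way to use it would be to wrap the `load_model` call in a `try`/`except Exception as exc` block and pass `str(exc)` to the helper.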

After I gave permissions to that service account (which I was only able to find in the error message), I was able to load and save models from my private bucket.
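The permission grant can be done in the Cloud Console, but it can also be scripted. The following is a rough sketch, not the exact steps used in this answer, that assumes the `google-cloud-storage` Python client and an IAM role binding; the helper names `grant_bucket_role` and `grant_on_real_bucket` are made up for illustration, and the role to pass is whichever one your task needs.

```python
def grant_bucket_role(bucket, service_account, role):
    """Add an IAM binding that gives service_account the role on bucket.

    `bucket` is expected to behave like google.cloud.storage.Bucket.
    """
    policy = bucket.get_iam_policy(requested_policy_version=3)
    policy.bindings.append(
        {"role": role, "members": {f"serviceAccount:{service_account}"}}
    )
    bucket.set_iam_policy(policy)
    return policy


def grant_on_real_bucket(bucket_name, service_account, role):
    """Apply the binding to a real bucket using default credentials."""
    # Imported lazily so the helper above does not require the library.
    from google.cloud import storage  # pip install google-cloud-storage

    client = storage.Client()
    return grant_bucket_role(client.bucket(bucket_name), service_account, role)
```

A call might look like `grant_on_real_bucket("my-bucket", account, "roles/storage.legacyBucketReader")`, where `"my-bucket"` is a placeholder and `account` is the address taken from the error message.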

David Buck
Henry Ives
  • Thank you Henry, I saw that error and service account too, so at least it is something stable. Now it is a matter of digging into Keras to see how it retrieves it. – fabrizioM May 11 '20 at 03:09
  • What storage permission level did you give it? I tried Storage Legacy Bucket Reader with object listing, but it didn't work. – SantoshGupta7 Jun 30 '20 at 03:44
  • As outlined in the [docs](https://cloud.google.com/tpu/docs/storage-buckets), Storage Legacy Bucket Reader/Writer (depending on the task) _should_ work. That said, I gave it Owner permissions as well (perhaps violating the principle of least privilege). – Henry Ives Jul 04 '20 at 20:54
  • This worked for me. Now I am wondering if it's possible to load data directly onto the TPU, since my data doesn't take that much memory and doesn't require multiple files; it can be all loaded into memory at once. – SantoshGupta7 Jul 10 '20 at 20:12
  • In my case, I cannot load model weights that were saved to GCP using Keras callbacks. I don't know what the issue is, but the workaround is to download the weights into Colab and upload them to a new bucket. More details here https://stackoverflow.com/questions/62866698/can-not-load-model-weights-saved-to-gcp-with-keras-save-weights-need-to-transfe – SantoshGupta7 Jul 12 '20 at 21:38
2

As stated in the public documentation, to find the service account of your Colab TPU you just need to substitute your project number into the following email address:

 service-[PROJECT_NUMBER]@cloud-tpu.iam.gserviceaccount.com

You can find your project number on the dashboard of your Google Cloud project.

After doing this, set your bucket's access control to fine-grained and grant this account access in the bucket's ACL.
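As a small illustration of the substitution described above, the address can be built from a project number like this. The helper name `tpu_service_account` and the number `123456789012` are placeholders; note that the other answer reports that the number appearing for a Colab TPU did not match the bucket's own project.

```python
def tpu_service_account(project_number):
    """Fill a numeric project number into the Cloud TPU address template."""
    number = str(project_number)
    if not number.isdigit():
        # A project ID such as "my-project" is not the project number.
        raise ValueError("expected a numeric project number, not a project ID")
    return f"service-{number}@cloud-tpu.iam.gserviceaccount.com"


# Placeholder project number, for illustration only.
print(tpu_service_account(123456789012))
```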

Chris32
  • Google Colab is not associated with any specific project. The TPU instance is auto-assigned on creation, and I am starting to doubt whether it has a service account at all, being somewhat "public". – fabrizioM Apr 27 '20 at 12:10
  • Then just use the `gcloud auth login` command to authenticate yourself before running the export. This will provide your notebook with an auth token for your account. Then giving permissions for this account in your bucket should work. – Chris32 Apr 27 '20 at 13:36
  • Could you please explain where this project number should be modified? – Joachim Jan 26 '21 at 18:47