I am trying to open files stored in a Google Cloud Storage bucket from a Google Colab notebook that uses the TPU runtime. However, I always get this error:
FileNotFoundError: [Errno 2] No such file or directory: 'gs://vocab_jb/merges.txt'
My question is very simple: how do I make a Google Cloud Storage bucket readable from Google Colab? I have tried all of the following:
- Making the bucket public using IAM
- Assigning a special e-mail address to the owner
- Making the file public through the ACL options
- Following several different tutorials
- Referring to the bucket, each time, as either "gs://bucket" or "https://..."
None of these options worked reliably. What confuses me even more is that making the bucket public did work for a limited amount of time. I have also read this post, but the answers didn't help. Also, I don't really care which read or write permissions the bucket ends up with.
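For reference, the two addressing forms mentioned in the list above are related as follows: an object at gs://BUCKET/OBJECT is served at https://storage.googleapis.com/BUCKET/OBJECT, but only while it is publicly readable. A minimal sketch of the mapping, where `gs_to_public_url` is a hypothetical helper name and not part of any Google library:

```python
from urllib.parse import quote


def gs_to_public_url(gs_uri: str) -> str:
    """Map gs://bucket/object to the HTTPS URL of a publicly readable object.

    Hypothetical helper for illustration; it only rewrites the string and
    does not check that the object exists or is actually public.
    """
    prefix = "gs://"
    if not gs_uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {gs_uri!r}")
    # Split "bucket/path/to/object" at the first slash.
    bucket, _, obj = gs_uri[len(prefix):].partition("/")
    if not bucket or not obj:
        raise ValueError(f"expected gs://bucket/object, got: {gs_uri!r}")
    # Percent-encode the object name but keep "/" separators intact.
    return f"https://storage.googleapis.com/{bucket}/{quote(obj)}"


print(gs_to_public_url("gs://vocab_jb/merges.txt"))
# prints https://storage.googleapis.com/vocab_jb/merges.txt
```

Neither form is a local filesystem path, so the HTTPS URL needs an HTTP client and the gs:// form needs a GCS-aware file API.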
I am initializing my TPU in the following way:
import os
use_tpu = True #@param {type:"boolean"}
bucket = 'vocab_jb'
if use_tpu:
    assert 'COLAB_TPU_ADDR' in os.environ, 'Missing TPU; did you request a TPU in Notebook Settings?'
from google.colab import auth
auth.authenticate_user()
%tensorflow_version 2.x
import tensorflow as tf
print("Tensorflow version " + tf.__version__)
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver('grpc://' + os.environ['COLAB_TPU_ADDR'])  # TPU detection
    print('Running on TPU ', tpu.cluster_spec().as_dict()['worker'])
except ValueError:
    raise BaseException('ERROR: Not connected to a TPU runtime; please see the previous cell in this notebook for instructions!')
tf.config.experimental_connect_to_cluster(tpu)
tf.tpu.experimental.initialize_tpu_system(tpu)
tpu_strategy = tf.distribute.experimental.TPUStrategy(tpu)
with open("gs://vocab_jb/merges.txt", 'rb') as f:
    a = f.read()
FileNotFoundError: [Errno 2] No such file or directory: 'gs://vocab_jb/merges.txt'
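One thing worth noting about the snippet above: Python's built-in open() only understands local filesystem paths, so it treats 'gs://vocab_jb/merges.txt' as a local file name and raises FileNotFoundError regardless of the bucket's permissions. TensorFlow's tf.io.gfile.GFile does accept gs:// paths. The following is a minimal sketch, not a confirmed fix: `read_bytes` is a hypothetical helper, and it assumes that auth.authenticate_user() has already run and that the account can read the bucket.

```python
def read_bytes(path: str) -> bytes:
    """Read a whole file, routing gs:// paths through TensorFlow's file API.

    Hypothetical helper for illustration. Assumes the Colab session is
    already authenticated and allowed to read the bucket.
    """
    if path.startswith("gs://"):
        # Imported lazily so local paths work without TensorFlow installed.
        import tensorflow as tf

        # GFile understands gs:// URIs, unlike the built-in open().
        with tf.io.gfile.GFile(path, "rb") as f:
            return f.read()
    # Anything else is treated as an ordinary local path.
    with open(path, "rb") as f:
        return f.read()
```

In the notebook this would be called as `a = read_bytes("gs://vocab_jb/merges.txt")`. Libraries that expect a local file name may still need the object copied to the Colab disk first.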