So I'm trying to build a photo classifier with 150 classes, and I want to train it on Google Colab TPUs. As far as I understand, for that I need a TFDS dataset loaded with try_gcs=True.
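By that I mean a call along these lines (shown here with a catalog dataset purely to illustrate the flag):

import tensorflow_datasets as tfds

# try_gcs=True tells tfds.load to read the already-prepared dataset straight
# from the public TFDS GCS bucket instead of preparing it on the Colab VM.
ds_example = tfds.load("mnist", split="train", try_gcs=True)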
To do that with my own data, I understood I need to put the dataset on Google Cloud Storage. So I converted a generator into a tf.data.Dataset and saved it locally with:
import tensorflow as tf

# 64x64 RGB float images and a 150-dimensional float label per example.
my_tf_ds = tf.data.Dataset.from_generator(
    datafeeder.allGenerator,
    output_signature=(
        tf.TensorSpec(shape=(64, 64, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(150,), dtype=tf.float32)))

# Write the dataset to disk in tf.data's snapshot format.
tf.data.experimental.save(my_tf_ds, filename)
Then I sent the saved directory to my bucket on GCS.
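(The copy was roughly the following; the pokemons_saved path is just a placeholder for whatever filename pointed to.)

!gsutil -m cp -r ./pokemons_saved gs://dataset-7000/pokemons_saved

But when I try to load it from the bucket with: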
import tensorflow_datasets as tfds

dsFromGcs = tfds.load("pokemons", data_dir="gs://dataset-7000")
it doesn't work; instead it lists the datasets registered in the TFDS catalog, such as:
- abstract_reasoning
- accentdb
- aeslc
- aflw2k3d
- ag_news_subset
- ai2_arc
- ai2_arc_with_ir
- amazon_us_reviews
- anli
- arc
None of these datasets are in my GCS bucket.
When I load it locally myself with:
tfds_from_file = tf.data.experimental.load(
    filename,
    element_spec=(
        tf.TensorSpec(shape=(64, 64, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(150,), dtype=tf.float32)))
it works and the dataset is fine.
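This makes me wonder whether, for a dataset saved this way, I should skip tfds.load entirely and point tf.data.experimental.load straight at the bucket instead, something like this (the object path under the bucket is the same placeholder as above):

tfds_from_gcs = tf.data.experimental.load(
    "gs://dataset-7000/pokemons_saved",
    element_spec=(
        tf.TensorSpec(shape=(64, 64, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(150,), dtype=tf.float32)))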
So I don't understand why I can't read my dataset from GCS with tfds.load. Can it read a private dataset stored on GCS, or only the datasets that are already defined in the TFDS catalog? I have also granted the Storage Legacy Bucket Reader role on my bucket to the public.
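For reference, I believe that grant is equivalent to something like the following (assuming "the public" means allUsers):

!gsutil iam ch allUsers:roles/storage.legacyBucketReader gs://dataset-7000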