
Please help if you can! I have a lot of individual images stored in a Google bucket. I want to retrieve individual images from the bucket through Google Colab. I have already set up a connection via gcsfuse, but I still cannot access the images.

I have tried:

I = io.imread('/content/coco/Val/Val/val2017/000000000139.jpg')
I = file_io.FileIO('/content/coco/Val/Val/val2017/000000000139.jpg', 'r')
I = tf.io.read_file('/content/coco/Val/Val/val2017/000000000139.jpg', 'r')

None have worked and I am confused.

io.imread returns "None"

file_io.FileIO returns "tensorflow.python.lib.io.file_io.FileIO at 0x7fb7e075e588" which I don't know what to do with.

tf.io.read_file returns an empty tensor.

(I am actually using PyTorch, not TensorFlow, but after some Google searches it seemed TensorFlow might have the answer.)

Andrew Gaul
Joshua Clancy
  • _file_io.FileIO returns "tensorflow.python.lib.io.file_io.FileIO at 0x7fb7e075e588" which I don't know what to do with._ Have you checked the documentation for that method? – AMC Apr 19 '20 at 02:59
  • Yeah, I went looking for about two hours but didn't get anywhere. The documentation is a bit of a mess. I'll be trying again today. – Joshua Clancy Apr 19 '20 at 17:42
  • Yeah, I went looking further and have now accessed that file type, and have determined that this too returns an empty file. I am still very confused. I uploaded all of those files and can access them on the bucket console page, but when I access them via Python they are suddenly empty. The Google documentation is no help, and multiple related Stack Overflow pages dance around the issue. (There seems to be a solution for Python 2 but not Python 3.) – Joshua Clancy Apr 19 '20 at 17:52

1 Answer


It is unclear to me whether your issue is with copying files from Google Cloud Storage to Colab or with accessing a file in Colab with Python.

As stated in the Colab documentation, to use Google Cloud Storage you should use the gsutil tool.
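As a rough illustration of the gsutil route, the sketch below only builds the copy command for a single object. The helper name build_gsutil_cp, the bucket name my-bucket, and the object path are hypothetical placeholders, not values taken from the question.

```python
import shlex


def build_gsutil_cp(bucket, object_path, dest_dir):
    """Return the argument list for copying one object out of a bucket.

    All three arguments are placeholders supplied by the caller.
    """
    # gsutil addresses objects as gs://<bucket>/<object path>.
    source = f"gs://{bucket}/{object_path.lstrip('/')}"
    return ["gsutil", "cp", source, dest_dir]


# Hypothetical bucket and object names, for illustration only.
cmd = build_gsutil_cp("my-bucket", "val2017/000000000139.jpg", "/content/")
print(shlex.join(cmd))
```

In a Colab cell the printed command can be run with a leading !, or the list can be passed to subprocess.run(cmd, check=True) after authenticating. Copying the file to local disk this way avoids the mount entirely, which may help narrow down where the empty reads come from.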

Anyway, I tried the gcsfuse tool myself by following these steps, and I was able to see the objects in my bucket by running the !ls command.

Steps:

from google.colab import auth
auth.authenticate_user()

Once you run this, a link is generated; click it and complete the sign-in.

!echo "deb http://packages.cloud.google.com/apt gcsfuse-bionic main" > /etc/apt/sources.list.d/gcsfuse.list
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
!apt -qq update
!apt -qq install gcsfuse

This installs gcsfuse on Colab.

!mkdir folderOnColab
!gcsfuse folderOnBucket folderOnColab

Replace folderOnColab with the desired name of your folder and folderOnBucket with the name of your bucket, without the gs:// prefix.

After following all these steps and running the !ls command, I was able to see the files from my bucket in the new folder in Colab.
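Seeing a file name in !ls does not by itself show that the file has content, which is the symptom described in the question. A minimal check, using only the standard library, might look like the sketch below. The helper name inspect_image_file is hypothetical, and the JPEG test only looks at the first two bytes, so it is a sanity check rather than full validation.

```python
import os


def inspect_image_file(path):
    """Report whether a file exists, its size in bytes, and whether it starts like a JPEG."""
    if not os.path.isfile(path):
        return {"exists": False, "size": 0, "looks_like_jpeg": False}
    size = os.path.getsize(path)
    with open(path, "rb") as fh:  # binary mode: image bytes are not text
        header = fh.read(2)
    # JPEG streams begin with the two-byte start-of-image marker FF D8.
    return {"exists": True, "size": size, "looks_like_jpeg": header == b"\xff\xd8"}
```

For example, calling inspect_image_file on a path inside folderOnColab and getting a size of 0 would suggest the problem lies with the mount or the bucket object rather than with the image library. If the size is non-zero and the header matches, opening the same path with an image library (for PyTorch, typically PIL) would be the next thing to try.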

Chris32