
I am working on a project where some of the data is provided through an S3 bucket, which I access with s3fs (S3FileSystem). I can read that data using S3FileSystem.open(path), but there are more than 360 files and it takes at least 3 minutes to read a single file. I was wondering: is there any way to download these files to my system and read them from there, instead of reading them directly from the S3FileSystem? There is another reason: although I can read all those files, once my Colab session reconnects I have to re-read them all, which takes a lot of time. I am using the following code to read the files:

import s3fs
import xarray as xr

# anonymous (public) access to the bucket
fs_s3 = s3fs.S3FileSystem(anon=True)
s3path = 'file_name'
remote_file_obj = fs_s3.open(s3path, mode='rb')
ds = xr.open_dataset(remote_file_obj, engine='h5netcdf')

Is there any way to download those files?


1 Answer


You can use s3fs-fuse (a different tool from the Python s3fs package) to mount the bucket as a local directory, then copy the files into Colab.

how to mount
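
In case that link goes stale, here is a minimal sketch of the mounting step in a Colab cell. The bucket name is a placeholder, and the public_bucket option assumes the bucket allows anonymous access (the same assumption as anon=True in your code):

# install the s3fs-fuse tool (distinct from the Python s3fs package)
!apt-get -qq install s3fs
# create a mount point and mount the public bucket onto it
!mkdir -p /s3
!s3fs your-bucket-name /s3 -o public_bucket=1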

After mounting, you can

!cp /s3/yourfile.zip /content/
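
Alternatively, since you already have an s3fs.S3FileSystem object, you can skip mounting entirely and download the files with its get method (inherited from fsspec), then open them locally. A sketch, with bucket-name/prefix/ standing in for wherever your 360 files live:

import s3fs
import xarray as xr

fs_s3 = s3fs.S3FileSystem(anon=True)

# copy the whole remote tree onto Colab's local disk once
fs_s3.get('bucket-name/prefix/', '/content/data/', recursive=True)

# later reads come from local disk and are fast
ds = xr.open_dataset('/content/data/file_name', engine='h5netcdf')

Either way, note that files under /content disappear when the Colab runtime is recycled, so you will need to re-run the copy step after a full reset.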