
I am trying to create an automated pipeline that gets files from the FiftyOne API and loads them into S3. From what I have seen, the fiftyone package can only download the data locally.

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "open-images-v6",
    split="validation",
    classes=["Cat", "Dog"],
    max_samples=100,
    label_types=["detections"],
    seed=51,
    dataset_name="open-images-pets",
)

That's the code I use to download the files; the problem is that they are downloaded locally. Does anyone have experience with this and know how it could be done?

Thank you!

Amdbi
  • browse their docs, make sure it can't do what you want, then submit a feature request https://github.com/voxel51/fiftyone/issues – Christoph Rackwitz Jun 06 '22 at 21:25
  • hmm, that might require some time for them to actually process that feature... but out of curiosity, let's say I download it locally and then load it to S3 using boto3 – if I were to push this code into an automated pipeline in SageMaker, would that work? – Amdbi Jun 07 '22 at 10:30
  • now that's beyond me. I'm here because you tagged [tag:computer-vision], not because there's S3 involved, or "pipelines in sagemaker". -- sounds feasible anyway. nothing prevents you from figuring out where that 51 thing puts its cached model/weight files, grabbing them, and doing whatever. – Christoph Rackwitz Jun 07 '22 at 10:58
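As the last comment suggests, you can check where FiftyOne actually puts the downloaded files. A minimal sketch, assuming the default zoo configuration and the dataset name from the snippet above:

import fiftyone as fo

# Zoo datasets are downloaded under the configured zoo directory
# (by default a folder under ~/fiftyone)
print(fo.config.dataset_zoo_dir)

# Or inspect where an individual sample's image lives on disk
dataset = fo.load_dataset("open-images-pets")
print(dataset.first().filepath)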

1 Answer


You're right that the code snippet you shared will download the files from Open Images to whatever local machine you are working on. From there, you can use something like boto3 to upload the files to S3. Then, you may want to check out the examples for using s3fs-fuse with FiftyOne to see how you can mount those cloud files and use them in FiftyOne.
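For example, a minimal sketch of the upload step with boto3 (the bucket name and key prefix below are placeholders, and the dataset is assumed to be the one created by the snippet in the question, loaded here by name):

import os

import boto3
import fiftyone as fo

# Load the dataset that the zoo loader downloaded locally
dataset = fo.load_dataset("open-images-pets")

s3 = boto3.client("s3")
bucket = "my-bucket"          # placeholder: your S3 bucket
prefix = "open-images-pets"   # placeholder: key prefix in the bucket

# Upload each sample's image file to S3
for sample in dataset:
    local_path = sample.filepath
    key = f"{prefix}/{os.path.basename(local_path)}"
    s3.upload_file(local_path, bucket, key)

The same idea works inside a SageMaker job, since the files are just ordinary paths on the local filesystem once the zoo download completes.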

Directly using FiftyOne inside of a SageMaker notebook is in development.

Note that FiftyOne Teams has more support for cloud data, with methods to upload/download to the cloud and use cloud objects directly rather than with s3fs-fuse.

Eric Hofesmann