I am building a Python cloud video pipeline that will read video from a bucket, perform some computer vision analysis, and write frames back to a bucket. As far as I can tell, there is no Beam read method, analogous to TextIO.read(), for passing GCS paths to OpenCV. My options moving forward seem to be: download the files locally (they are large), use GCS FUSE to mount the bucket on a worker (is that possible?), or write a custom source. Does anyone have experience with which makes the most sense?
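For concreteness, here is a rough sketch of what I imagine the "download locally" option would look like. It assumes `apache_beam[gcp]` and `opencv-python` are installed on the workers; the helper names (`staged_locally`, `extract_frames`, `write_frame`, `run`), the `every_n` sampling parameter, and the output layout are all my own placeholders, not anything Beam provides.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def staged_locally(remote_path, open_fn):
    """Copy a remote file to worker-local disk, yield the local path, then delete it."""
    suffix = os.path.splitext(remote_path)[1]  # keep the extension so OpenCV can pick a demuxer
    fd, local_path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as dst, open_fn(remote_path) as src:
            # Stream in chunks so a large video is never held fully in memory.
            shutil.copyfileobj(src, dst, length=16 * 1024 * 1024)
        yield local_path
    finally:
        os.remove(local_path)


def extract_frames(gcs_path, every_n=30):
    """Yield (gcs_path, frame_index, jpeg_bytes) for every `every_n`-th frame."""
    # Imported inside the function so the modules are resolved on the worker.
    import cv2
    from apache_beam.io.filesystems import FileSystems

    with staged_locally(gcs_path, FileSystems.open) as local_path:
        cap = cv2.VideoCapture(local_path)
        try:
            index = 0
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                if index % every_n == 0:
                    encoded, buf = cv2.imencode(".jpg", frame)
                    if encoded:
                        yield gcs_path, index, buf.tobytes()
                index += 1
        finally:
            cap.release()


def write_frame(element, out_prefix):
    """Write one encoded frame under out_prefix (hypothetical naming scheme)."""
    from apache_beam.io.filesystems import FileSystems

    gcs_path, index, jpeg = element
    name = os.path.splitext(os.path.basename(gcs_path))[0]
    out_path = f"{out_prefix}/{name}/frame_{index:06d}.jpg"
    with FileSystems.create(out_path) as f:
        f.write(jpeg)
    return out_path


def run(input_glob, out_prefix, pipeline_args=None):
    import apache_beam as beam
    from apache_beam.io import fileio
    from apache_beam.options.pipeline_options import PipelineOptions

    with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
        (p
         | fileio.MatchFiles(input_glob)      # e.g. "gs://my-bucket/videos/*.mp4"
         | beam.Map(lambda match: match.path)
         | beam.Reshuffle()                   # spread the videos across workers
         | beam.FlatMap(extract_frames)
         | beam.Map(write_frame, out_prefix))
```

In this version each video is processed whole by a single worker, so the worker disk has to be big enough for the largest file; I don't know whether that is acceptable compared with the other two options.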
My main confusion comes from this question:
Can google cloud dataflow (apache beam) use ffmpeg to process video or image data
How would ffmpeg have access to the path? It's not just a question of uploading the binary, is it? There needs to be a Beam method to pass the item to it, correct?
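To make the ffmpeg part of the question concrete: the only way I can picture it working is a pipeline step that first copies the video to worker-local disk and then shells out to the binary with that local path. A minimal sketch of that call, assuming an `ffmpeg` binary is already on the worker's `PATH`; the function names and the one-frame-per-second default are hypothetical.

```python
import os
import subprocess


def ffmpeg_frames_cmd(local_video, out_dir, fps=1):
    """Build an ffmpeg command that dumps `fps` frames per second as JPEGs."""
    pattern = os.path.join(out_dir, "frame_%06d.jpg")
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", local_video,      # a path on the worker's disk, not a gs:// URL
        "-vf", f"fps={fps}",
        pattern,
    ]


def run_ffmpeg(local_video, out_dir, fps=1):
    """Run ffmpeg on a local file; raises CalledProcessError if ffmpeg fails."""
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run(ffmpeg_frames_cmd(local_video, out_dir, fps), check=True)
    return sorted(os.path.join(out_dir, name) for name in os.listdir(out_dir))
```

The frames written to `out_dir` would then still have to be uploaded back to the bucket by the same step, which is the part I am unsure Beam has a built-in method for.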