
Is it possible to look up the audio metadata for a file stored in Google Cloud Storage without having to download it? When building a Google Speech-to-Text API service you pass it a gs://bucket/file.flac. I know the sox and ffmpeg commands (bash and Python) for looking up metadata on locally stored files, but I can't seem to figure out a way to look up audio metadata for a file in Google Cloud Storage.

Additionally, if I have a gs://bucket/audio.wav, can I re-encode it using sox/py-sox and write the new audio.flac directly to gs://bucket/audio.flac? Or do I have to download audio.wav to re-encode it?

Any thoughts or directions appreciated.

libroman2

1 Answer


No, it is not possible to access the metadata you want directly in Google Cloud Storage. The command gsutil ls -L gs://[bucket_name]/[file_name] prints the metadata of that object in the bucket, but this is object metadata (size, content type, hashes, custom key/value pairs), not audio properties such as sample rate or channel count. You can modify the object metadata, but not the properties you are referring to. To read those, you will need to download the file.
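As a rough illustration of the download-then-inspect route, here is a minimal sketch that pulls a WAV object into memory and reads its header with the standard-library `wave` module. The helper names `wav_info` and `gcs_wav_info` are made up for this example, it assumes the `google-cloud-storage` client library is installed and credentials are configured, and `wave` only understands PCM WAV, so a FLAC file would still need a tool like sox or ffmpeg after downloading.

```python
import io
import wave


def wav_info(data: bytes) -> dict:
    """Read basic audio properties from WAV bytes using only the standard library."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "frames": wav.getnframes(),
        }


def gcs_wav_info(bucket_name: str, blob_name: str) -> dict:
    """Download a WAV object into memory and inspect it (hypothetical helper)."""
    # Imported lazily so wav_info stays usable without the client library.
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    # The whole object is downloaded; nothing is written to local disk.
    return wav_info(blob.download_as_bytes())
```

Something like `gcs_wav_info("bucket", "audio.wav")` would then return the channel count and sample rate without leaving a local copy behind, though the bytes are still transferred.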

You cannot do the re-encoding inside Cloud Storage either: you will need to download the file and process it the way you want before uploading it again to your bucket. However, here is a workaround, if it works for you:

Create a Cloud Function that is triggered when your file is uploaded. In it, retrieve the file that was just uploaded and perform whatever operation you want on it (such as re-encoding to .flac). After that, upload the result to the bucket. Careful: if you give the new file the same name and extension as the original, it will overwrite the original in the bucket.
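The workaround above could look roughly like the following background function for a Cloud Storage upload ("finalize") trigger. This is only a sketch under several assumptions: the function names `flac_name` and `reencode_to_flac` are invented here, the `google-cloud-storage` library is listed in the function's requirements, and an `ffmpeg` binary is available on the PATH of the runtime (ffmpeg is used in place of py-sox). The output gets a .flac extension so it does not overwrite the original, and non-.wav uploads are skipped so the function does not react to its own output.

```python
import os
import subprocess
import tempfile


def flac_name(object_name: str) -> str:
    """Map 'dir/audio.wav' to 'dir/audio.flac' so the original is not overwritten."""
    root, _ext = os.path.splitext(object_name)
    return root + ".flac"


def reencode_to_flac(event, context):
    """Background function for a Cloud Storage 'finalize' (upload) trigger."""
    bucket_name, name = event["bucket"], event["name"]
    if not name.lower().endswith(".wav"):
        return None  # Ignore other uploads, including the .flac written below.

    # Lazy import: only needed once there is real work to do.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.wav")
        dst = os.path.join(tmp, "out.flac")
        bucket.blob(name).download_to_filename(src)
        # Assumption: an ffmpeg binary is on PATH in the function runtime.
        subprocess.run(["ffmpeg", "-y", "-i", src, dst], check=True)
        target = flac_name(name)
        bucket.blob(target).upload_from_filename(dst)
    return target
```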

As for your library: Cloud Functions use Python 3.7, which for the time being does not support py-sox, so you will need to find an alternative.