I have a problem using the videohash package for Python when deployed to an Azure Function.

My deployed Azure Function does not seem to be able to use a nested dependency properly. Specifically, I am trying to use the "videohash" package and its VideoHash class. The input to VideoHash is a SAS URL for a video stored in Azure Blob Storage.

The function's monitor output shows:

Error log

Accessing the SAS URL directly takes me to the video, so that part seems to be working.

Looking at the source code for videohash, this error seems to occur while downloading the video from a given URL (link: https://github.com/akamhy/videohash/blob/main/videohash/downloader.py).

Code snippet

... where self.yt_dlp_path = str(which("yt-dlp")). To me this indicates that, after deploying the function, yt-dlp is not being found (or is not properly installed). yt-dlp is a dependency of the videohash module, but adding yt-dlp directly to the requirements file of the Azure Function does not solve the issue either.
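One way to test this suspicion from inside the deployed function might look like the sketch below. It assumes that `which` in the videohash source is `shutil.which` from the standard library; the helper names `find_executable` and `describe` are hypothetical and exist only for illustration. `shutil.which` returns `None` when the executable is not on the PATH, so wrapping it in `str(...)` would produce the literal string "None".

```python
import shutil
from typing import Optional


def find_executable(name: str) -> Optional[str]:
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def describe(name: str) -> str:
    """Build a one-line report that can be logged from the function."""
    path = find_executable(name)
    if path is None:
        # str(None) is the literal string "None"; a str(which(...)) call
        # would end up holding that string instead of a real path.
        return f"{name}: not found on PATH"
    return f"{name}: {path}"


# Report whether the yt-dlp executable is visible to this process.
print(describe("yt-dlp"))
```

If this reports that yt-dlp is not found, the Python package may be installed without its command-line script being on the PATH of the function's worker process.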

Any ideas on what is happening? 

 

Deploying code to Azure function, which resulted in the details highlighted in the issue description.

shafee

1 Answer

  • I have a workaround: download the video file yourself using azure.storage.blob instead of letting videohash download it.

  • To download, you will need a BlobServiceClient, a ContainerClient and the connection string of the Azure storage account.

  • Please create two files called v1.mp3 and v2.mp3 before downloading the video.

File structure: (screenshot not included)

Complete Code:

import os
import tempfile

import azure.functions as func
from azure.storage.blob import BlobServiceClient
from videohash import VideoHash


def main(req: func.HttpRequest) -> func.HttpResponse:
    # Local file paths on the server
    local_path = tempfile.gettempdir()
    filepath1 = os.path.join(local_path, "v1.mp3")
    filepath2 = os.path.join(local_path, "v2.mp3")

    # Reference to Blob Storage
    client = BlobServiceClient.from_connection_string("<Connection String >")

    # Reference to the container
    container = client.get_container_client(container="test")

    # Download the blobs into the local files
    with open(file=filepath1, mode="wb") as download_file:
        download_file.write(container.download_blob("v1.mp3").readall())

    with open(file=filepath2, mode="wb") as download_file:
        download_file.write(container.download_blob("v2.mp3").readall())

    # Compute the video hashes from the local files and compare them
    videohash1 = VideoHash(path=filepath1)
    videohash2 = VideoHash(path=filepath2)
    t = videohash2.is_similar(videohash1)
    return func.HttpResponse(f"Hello, {t}. This HTTP triggered function executed successfully.")

Output: (screenshot not included)

Here I am getting an ffmpeg error, which is related to my test file and not to the error you are facing.

As far as I know, this workaround will not affect performance, since in both scenarios you are downloading the blobs anyway.
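If only a SAS URL is available rather than a connection string, the same idea might be sketched with the standard library alone, assuming the SAS URL can be fetched with a plain HTTP GET (the question reports that opening it directly reaches the video). The helper `download_to_temp` and the `.mp4` default suffix are hypothetical, not part of videohash or the Azure SDK.

```python
import os
import shutil
import tempfile
import urllib.request


def download_to_temp(url: str, suffix: str = ".mp4") -> str:
    """Stream `url` into a new temporary file and return that file's path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Copy the response body in chunks so the whole video is never
        # held in memory at once.
        with os.fdopen(fd, "wb") as out, urllib.request.urlopen(url) as response:
            shutil.copyfileobj(response, out)
    except Exception:
        # Do not leave a partial file behind if the download fails.
        os.remove(path)
        raise
    return path


# Hypothetical usage inside the function (placeholder URL, not a real one):
# video_path = download_to_temp("<SAS URL>")
# video_hash = VideoHash(path=video_path)
```

Passing the resulting local path to VideoHash(path=...) avoids the URL download step inside videohash, in the same way as the blob client code above.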

Mohit Ganorkar