
My code is shown below. It is deployed in Azure and the blob trigger fires correctly, but the result contains what looks like an audio buffer or encoded bytes instead of text:

  1. The output is not a plain-text transcription. We get an audio buffer or encoded data instead, and this needs to be fixed.
  2. The audio files we send are already in .wav format, so I don't think there is a format issue.

import logging
import azure.functions as func
import azure.cognitiveservices.speech as speechsdk
from azure.storage.blob import BlobServiceClient, ContentSettings

# Azure Speech Service configuration

speech_key = "speech-key"
service_region = "eastus"

# Configure Speech Recognizer

speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

def main(blob: func.InputStream, outputBlob: func.Out[str]):
    logging.info(f"Python blob trigger function processed blob\n"
                 f"Name: {blob.name}\n"
                 f"Blob Size: {blob.length} bytes")

    # Get the audio data from the blob
    audio_data = blob.readall()
    
    # Create an audio stream from the audio data
    audio_stream = speechsdk.audio.AudioDataStream(audio_data)
    
    # Create a speech recognizer
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_stream)
    
    # Perform speech recognition
    result = speech_recognizer.recognize_once()
    
    # Get the transcript
    transcript = result.text if result.reason == speechsdk.ResultReason.RecognizedSpeech else ""
    
    # Save transcript as a text file
    output_file = "output.txt"
    with open(output_file, "w") as file:
        file.write(transcript)
    
    # Save transcript to a new blob
    connection_string = "DefaultEndpointsProtocol=https;AccountName=scribeemrnewtesting8a49;AccountKey=RvyepyfHuKQjIrbPLSnx36zNB65l64kUurdSFi903DhXdz+pzbFTgpuNk6yviESmPlIVDFqMkjxU+AStRPdFAA==;EndpointSuffix=core.windows.net"
    output_container_name = "text-output"
    blob_name = blob.name.replace(".wav", ".txt")
    
    blob_service_client = BlobServiceClient.from_connection_string(connection_string)
    output_container_client = blob_service_client.get_container_client(output_container_name)
    
    # Upload transcript to output blob container/
    output_blob_client = output_container_client.get_blob_client(blob_name)
    with open(output_file, "rb") as file:
        output_blob_client.upload_blob(file, overwrite=True, content_settings=ContentSettings(content_type='text/plain'))
    
    blob.delete()
    
    logging.info(f"Transcription completed and saved to the out-blob container: {transcript}")
    outputBlob.set(transcript)

getting output as: RIFF�� WAVEfmt �> } data�� �������� �� ! �� �� �� % ���� "


Expected: plain text containing the transcription produced by the logic above.

1 Answer

First, I created a container for the input blob and left it empty.

I then created a second container for the output blob and left that empty as well.

I made some changes to your code and got the expected text output: the input .wav blob was transcribed into a .txt blob, and the transcription was also written to output.txt.

Code:

import logging
import azure.functions as func
import azure.cognitiveservices.speech as speechsdk
from azure.storage.blob import BlobServiceClient, ContentSettings
import tempfile

speech_key = "<key>"
service_region = "<region>"

speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

def main(blob: func.InputStream, outputBlob: func.Out[str]):
    logging.info(f"Python blob trigger function processed blob\n"
                 f"Name: {blob.name}\n"
                 f"Blob Size: {blob.length} bytes")

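    try:
        # Read the whole WAV blob into memory.
        audio_data = blob.read()

        # The recognizer is given a file path here, so write the bytes to a temp file first.
        with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as temp_file:
            temp_file.write(audio_data)
            temp_file.flush()  # make sure the bytes are on disk before the SDK opens the file
            audio_config = speechsdk.audio.AudioConfig(filename=temp_file.name)
            speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
            result = speech_recognizer.recognize_once()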

        transcript = result.text if result.reason == speechsdk.ResultReason.RecognizedSpeech else ""
        output_file = "output.txt"
        with open(output_file, "w") as file:
            file.write(transcript)

        connection_string = "<storage-connec-string>"
        output_container_name = "text-output"
        blob_name = blob.name.replace(".wav", ".txt")

        blob_service_client = BlobServiceClient.from_connection_string(connection_string)
        output_container_client = blob_service_client.get_container_client(output_container_name)

        output_blob_client = output_container_client.get_blob_client(blob_name)
        with open(output_file, "rb") as file:
            output_blob_client.upload_blob(file, overwrite=True, content_settings=ContentSettings(content_type='text/plain'))

        logging.info(f"Transcription completed and saved to the out-blob container: {transcript}")
        outputBlob.set(transcript)

    except Exception as ex:
        logging.error(f"Exception occurred during speech recognition: {ex}")
        raise  # re-raise with the original traceback so the invocation is marked as failed
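
The `RIFF … WAVEfmt` text shown in the question is the header of the WAV file itself, which means raw audio bytes ended up in the output rather than a transcript. A minimal sketch of two standard-library helpers that may make this easier to catch: `looks_like_wav` checks for the RIFF/WAVE header, and `transcript_blob_name` derives the output blob name. Both function names are hypothetical, and the sketch assumes `blob.name` arrives in the form `container/blob-name`.

```python
from pathlib import PurePosixPath


def looks_like_wav(data: bytes) -> bool:
    """Return True if the bytes start with a RIFF/WAVE header."""
    # A WAV file starts with b"RIFF", a 4-byte chunk size, then b"WAVE".
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def transcript_blob_name(trigger_name: str) -> str:
    """Map a name such as 'audio-input/sample.wav' to 'sample.txt'."""
    # Drop the container prefix and swap only the final suffix;
    # str.replace would also rewrite '.wav' elsewhere in the name.
    return PurePosixPath(trigger_name).with_suffix(".txt").name


if __name__ == "__main__":
    header = b"RIFF" + (36).to_bytes(4, "little") + b"WAVEfmt "
    print(looks_like_wav(header))
    print(looks_like_wav(b"some transcript text"))
    print(transcript_blob_name("audio-input/sample.wav"))
```

Logging a warning when `looks_like_wav(transcript.encode())` is true before calling `outputBlob.set` would be one way to notice that audio bytes, not text, are about to be written.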

function.json:

{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "blob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "audio-input/sample.wav",
      "connection": "AzureWebJobsStorage"
    },
    {
      "name": "outputBlob",
      "type": "blob",
      "direction": "out",
      "path": "text-output/sample.txt",
      "connection": "AzureWebJobsStorage"
    }
  ]
}

Replace the container names and blob names in the function.json file above with your own.
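
With `sample.wav` hard-coded in the path, only that one blob fires the trigger. Azure Functions binding paths also accept a `{name}` binding expression, so one possible variant is sketched below. It builds the same function.json structure from Python so that the input and output paths stay in sync. `build_function_json` is a hypothetical helper, and the container names are the ones used in this answer.

```python
import json


def build_function_json(input_container: str, output_container: str) -> dict:
    """Build a function.json dict that triggers on any .wav blob."""
    return {
        "scriptFile": "__init__.py",
        "bindings": [
            {
                "name": "blob",
                "type": "blobTrigger",
                "direction": "in",
                # {name} captures the blob name without the .wav extension
                "path": f"{input_container}/{{name}}.wav",
                "connection": "AzureWebJobsStorage",
            },
            {
                "name": "outputBlob",
                "type": "blob",
                "direction": "out",
                # the same {name} is reused for the transcript blob
                "path": f"{output_container}/{{name}}.txt",
                "connection": "AzureWebJobsStorage",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_function_json("audio-input", "text-output"), indent=2))
```

With paths like these, `outputBlob.set(transcript)` alone should write the transcript to the matching .txt blob, which may make the separate `BlobServiceClient` upload unnecessary.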

local.settings.json:

{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "AzureWebJobsStorage": "<storage-account-conne-string>",
    "AzureWebJobsFeatureFlags": "EnableWorkerIndexing"
  }
}
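
Values in local.settings.json (and application settings in Azure) are exposed to the function as environment variables, so the speech key and the storage connection string do not have to be hard-coded in `__init__.py`. A minimal sketch, assuming two extra settings named `SPEECH_KEY` and `SPEECH_REGION` (hypothetical names that are not part of the file above); `Settings` and `load_settings` are illustrative names as well:

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    speech_key: str
    speech_region: str
    storage_connection: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the required settings, failing early with a clear message."""
    # SPEECH_KEY and SPEECH_REGION are assumed names; AzureWebJobsStorage
    # is the setting already present in local.settings.json.
    required = ("SPEECH_KEY", "SPEECH_REGION", "AzureWebJobsStorage")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError(f"Missing app settings: {', '.join(missing)}")
    return Settings(env["SPEECH_KEY"], env["SPEECH_REGION"], env["AzureWebJobsStorage"])
```

The function could then build its `SpeechConfig` from `load_settings().speech_key` and `.speech_region` instead of literal strings.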

requirements.txt:

azure-functions
azure-cognitiveservices-speech==1.21.0
azure-storage-blob

Output:

The blob trigger function starts successfully, and before any blob is uploaded no text is generated in the output.txt file.

Next, I uploaded a sample.wav file to my input container.

The output.txt file was then generated with the text transcribed from the sample.wav blob, and a sample.txt file was created in my text-output container.

Finally, I downloaded the output blob and confirmed that it contains the transcribed text.

Dasari Kamali
  • Result: Failure Exception: ModuleNotFoundError: No module named 'azure.cognitiveservices'. Please check the requirements.txt file for the missing module. I get this error after modifying the code as suggested and uploading an audio file named "sample.wav" to the input container, and nothing appears in the output container. I did everything in the Azure UI itself and installed requirements.txt with the Azure CLI. – Fitness Freak Jun 30 '23 at 12:00
  • @FitnessFreak "ModuleNotFoundError: No module named 'azure.cognitiveservices'" indicates that the module 'azure.cognitiveservices' is not installed in your Python environment. To resolve this, make sure the module is installed in the environment the function actually runs in. – Dasari Kamali Jun 30 '23 at 12:13
  • @FitnessFreak Try installing it with **pip install azure-cognitiveservices-speech**, check whether it is installed with **pip show azure-cognitiveservices-speech**, and then run the function again. – Dasari Kamali Jun 30 '23 at 12:16
  • azure-cognitiveservices-speech is installed; I have checked that as well. I still get the same module-not-found error, and the trigger does not fire at all. – Fitness Freak Jul 06 '23 at 04:16
  • @FitnessFreak Can you check which Python version you are using for the code and tell me the azure-cognitiveservices-speech version? – Dasari Kamali Jul 06 '23 at 05:04
  • Python version: 3.9.14; azure-cognitiveservices-speech version: 1.21.0 – Fitness Freak Jul 06 '23 at 06:21
  • @FitnessFreak Can you try uninstalling Python 3.9.14, installing 3.10.0, and running it again? – Dasari Kamali Jul 06 '23 at 06:25
  • Done, I upgraded the Python version. Now when I upload a file to the input container, the monitor logs show no new entries or error messages; the trigger still does not fire. – Fitness Freak Jul 06 '23 at 07:03
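
A `ModuleNotFoundError` inside the Function App while `pip show` succeeds on the command line often means the package was installed into a different Python environment than the one the worker runs. A small diagnostic sketch that could be logged at the top of `main` to show which interpreter is running and whether the module is visible to it; `describe_module` is a hypothetical helper and uses only the standard library:

```python
import importlib.util
import sys


def describe_module(module_name: str) -> str:
    """Report the running interpreter and whether module_name can be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package (for example 'azure') is itself missing.
        spec = None
    status = "found" if spec is not None else "NOT found"
    version = ".".join(str(part) for part in sys.version_info[:3])
    return f"Python {version} at {sys.executable}: {module_name} {status}"


if __name__ == "__main__":
    print(describe_module("azure.cognitiveservices.speech"))
```

Comparing the interpreter path in this message with the one reported by `pip --version` is one way to check whether both refer to the same environment.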