I'm trying to use the requests package in Python to call the Microsoft Bing Speech Transcription API. The call works when I use Postman, but there I have to select the file to upload manually through Postman's GUI, and I'm not sure how that file selection gets mapped onto the actual HTTP request (and, by extension, onto the Python requests call). Postman can convert its internal queries into code, and according to Postman the HTTP request it's making is:
POST /recognize?scenarios=smd&appid=[REDACTED]&locale=en-US&device.os=wp7&version=3.0&format=json&form=BCSSTT&instanceid=[REDACTED]&requestid=[REDACTED] HTTP/1.1
Host: speech.platform.bing.com
Authorization: [REDACTED]
Content-Type: application/x-www-form-urlencoded
Cache-Control: no-cache
Postman-Token: [REDACTED]
undefined
And the equivalent request, if made through the Python requests library, would be:
import requests
url = "https://speech.platform.bing.com/recognize"
querystring = {"scenarios":"smd","appid":[REDACTED],"locale":"en-US","device.os":"wp7","version":"3.0","format":"json","form":"BCSSTT","instanceid":[REDACTED],"requestid":[REDACTED]}
headers = {
    'authorization': [REDACTED],
    'content-type': "application/x-www-form-urlencoded",
    'cache-control': "no-cache",
    'postman-token': [REDACTED]
}
response = requests.request("POST", url, headers=headers, params=querystring)
print(response.text)
Note, however, that in neither case does the generated code actually pass in the audio file to be transcribed (clearly Postman doesn't know how to display raw audio data), so I'm not sure how to add this crucial information to the request. I assume that in the raw HTTP request the audio stream goes in the spot displayed as "undefined". For the Python requests call, from reading the documentation it seems like the response = requests.request(...) line should be replaced by:
response = requests.request("POST", url, headers=headers, params=querystring, files={'file': open('PATH/TO/AUDIO/FILE', 'rb')})
But when I run this query I get "Request timed out (> 14000 ms)". Any idea how I can successfully call the Microsoft Speech API through Python? Any help would be much appreciated, thanks.
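To make the two upload styles concrete, here is a small sketch that builds both requests locally with requests.Request(...).prepare() and inspects them without sending anything. With files=, requests wraps the data in a multipart/form-data body; with data= and a bytes object, the bytes become the request body unchanged, which is what the "undefined" spot in the raw HTTP request would correspond to. Whether the Bing endpoint expects a raw body, and the "audio/wav" content type used here, are assumptions on my part rather than something the generated code confirms. The audio bytes and the "audio.wav" file name are hypothetical stand-ins for the real file.

```python
import io
import requests

URL = "https://speech.platform.bing.com/recognize"

# Hypothetical stand-in for open('PATH/TO/AUDIO/FILE', 'rb').read()
audio_bytes = b"RIFF\x00\x00\x00\x00WAVEfmt "


def build_multipart(audio):
    """Prepare (but do not send) a request that uploads the audio via files=."""
    req = requests.Request(
        "POST", URL, files={"file": ("audio.wav", io.BytesIO(audio))}
    )
    return req.prepare()


def build_raw(audio, content_type="audio/wav"):
    """Prepare (but do not send) a request whose body is the raw audio bytes.

    The content type is an assumption, not taken from the Postman output.
    """
    req = requests.Request(
        "POST", URL, data=audio, headers={"Content-Type": content_type}
    )
    return req.prepare()


multipart = build_multipart(audio_bytes)
raw = build_raw(audio_bytes)

# files= produces a multipart envelope around the audio
print(multipart.headers["Content-Type"])
# data= leaves the header as given and the body as the bytes themselves
print(raw.headers["Content-Type"])
print(raw.body == audio_bytes)
```

Printing the prepared headers and body this way shows exactly what would go over the wire, so the two variants can be compared against the request that Postman reports. Note that in my attempt above I also kept the 'content-type' header set to "application/x-www-form-urlencoded" while passing files=, so the header and the multipart body would not agree with each other.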