Using Azure functions in a python runtime env. to take latitude/longitude tuples from a snowflake database and return the respective countries. We also want to convert any non-english country names into English.
We initially found that although the script would show output in the terminal while testing on azure, it would soon return a 503 error (although the script continues to run at this point). If we cancelled the script it would show as a success in the monitor screen of azure portal, however leaving the script to run to completion would result in the script failing. We decided (partially based on this post) this was due to the runtime exceeding the maximum http response time allowed. To combat this we tried a number of solutions.
First we extended the function timeout value in the function.json
file:
{
"version": "2.0",
"logging": {
"applicationInsights": {
"samplingSettings": {
"isEnabled": true,
"excludedTypes": "Request"
}
}
},
"extensionBundle": {
"id": "Microsoft.Azure.Functions.ExtensionBundle",
"version": "[2.*, 3.0.0)"
},
"functionTimeout": "00:10:00"
}
We then modified our script to use a queue trigger by adding the output
def main(req: func.HttpRequest, msg: func.Out[func.QueueMessage]) ->func.HttpResponse:
to the main .py
script. We also then modified the function.json
file to
{
"scriptFile": "__init__.py",
"bindings": [
{
"authLevel": "function",
"type": "httpTrigger",
"direction": "in",
"name": "req",
"methods": [
"get",
"post"
]
},
{
"type": "http",
"direction": "out",
"name": "$return"
},
{
"type": "queue",
"direction": "out",
"name": "msg",
"queueName": "processing",
"connection": "QueueConnectionString"
}
]
}
and the local.settings.json
file to
{
"IsEncrypted": false,
"Values": {
"FUNCTIONS_WORKER_RUNTIME": "python",
"AzureWebJobsStorage": "{AzureWebJobsStorage}",
"QueueConnectionString": "<Connection String for Storage Account>",
"STORAGE_CONTAINER_NAME": "testdocs",
"STORAGE_TABLE_NAME": "status"
}
}
We also then added a check to see if the country name was already in English. The intention here was to cut down on calls to the translate
function.
After each of these changes we redeployed to the functions app and tested again. Same result. The function will run, and print output to terminal, however after a few seconds it will show a 503 error and eventually fail.
I can show a code sample but cannot provide the tables unfortunately.
from snowflake import connector
import pandas as pd
import pyarrow
from geopy.geocoders import Nominatim
from deep_translator import GoogleTranslator
from pprint import pprint
import langdetect
import logging
import azure.functions as func
def main(req: func.HttpRequest, msg: func.Out[func.QueueMessage]) -> func.HttpResponse:
logging.info('Python HTTP trigger function processed a request.')
# Connecting string to Snowflake
conn = connector.connect(
user='<USER>',
password='<PASSWORD>',
account='<ACCOUNT>',
warehouse='<WH>',
database='<DB>',
schema='<SCH>'
)
# Creating objects for Snowflake, Geolocation, Translate
cur = conn.cursor()
geolocator = Nominatim(user_agent="geoapiExercises")
translator = GoogleTranslator(target='en')
# Fetching weblog data to get the current latlong list
fetchsql = "SELECT PAGELATLONG FROM <TABLE_NAME> WHERE PAGELATLONG IS NOT NULL GROUP BY PAGELATLONG;"
logging.info(fetchsql)
cur.execute(fetchsql)
df = pd.DataFrame(cur.fetchall(), columns = ['PageLatLong'])
logging.info('created data frame')
# Creating and Inserting the mapping into final table
for index, row in df.iterrows():
latlong = row['PageLatLong']
location = geolocator.reverse(row['PageLatLong']).raw['address']
logging.info('got addresses')
city = str(location.get('state_district'))
country = str(location.get('country'))
countrycd = str(location.get('country_code'))
logging.info('got countries')
# detect language of country
res = langdetect.detect_langs(country)
lang = str(res[0]).split(':')[0]
conf = str(res[0]).split(':')[0]
if lang != 'en' and conf > 0.99:
country = translator.translate(country)
logging.info('translated non-english country names')
insertstmt = "INSERT INTO <RESULTS_TABLE> VALUES('"+latlong+"','"+city+"','"+country+"','"+countrycd+"')"
logging.info(insertstmt)
try:
cur.execute(insertstmt)
except Exception:
pass
return func.HttpResponse("success")
If anyone had an idea what may be causing this issue I'd appreciate any suggestions.
Thanks.