I am trying to run Spark NLP as an Azure Function.
I have a function app that runs in a Docker container. The function app code is written in Python, and I also install Java in the container because I run PySpark within it. I use Python's Flask within one function to handle incoming requests.
Once the function app starts and the container is running, I get responses to my API calls for the first few seconds, but after about 15-20 seconds the calls start timing out because the server stops responding.
The function app runs on a dedicated App Service plan and has 'Always On' enabled.
What could be causing this behavior?
Here is my function app code:
import logging
import azure.functions as func
# Imports for Spark-NLP
import os
import sys
sys.path.append('/home/site/wwwroot/contextSpellCheck/spark-2.4.7-bin-hadoop2.7/python')
sys.path.append('/home/site/wwwroot/contextSpellCheck/spark-2.4.7-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip')
import sparknlp
from sparknlp.annotator import *
from sparknlp.common import *
from sparknlp.base import *
from pyspark.ml import Pipeline
from flask import Flask, request
app = Flask(__name__)
spark = sparknlp.start()
documentAssembler = DocumentAssembler().setInputCol("text").setOutputCol("document")
tokenizer = RecursiveTokenizer().setInputCols(["document"]).setOutputCol("token").setPrefixes(["\"", "(", "[", "\n"]).setSuffixes([".", ",", "?", ")", "!", "'s"])
spellModel = ContextSpellCheckerModel.load("/home/site/wwwroot/contextSpellCheck/spellcheck_dl_en_2.5.0_2.4_1588756259065").setInputCols("token").setOutputCol("checked")
finisher = Finisher().setInputCols("checked")
pipeline = Pipeline(stages=[documentAssembler, tokenizer, spellModel, finisher])
empty_ds = spark.createDataFrame([[""]]).toDF("text")
lp = LightPipeline(pipeline.fit(empty_ds))
@app.route('/api/testFunction', methods = ['GET', 'POST'])
def annotate():
    global lp
    if request.method == 'GET':
        text = request.args.get('text')
    elif request.method == 'POST':
        req_body = request.get_json()
        text = req_body['text']
    return lp.annotate(text)

def main(req: func.HttpRequest, context: func.Context) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    return func.WsgiMiddleware(app).handle(req, context)
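
To pin down when the responses described above stop, a small client-side probe along these lines may help. This is only a minimal sketch and not part of the function app: `probe`, `fetch_once`, the sample text, and the default attempt count, interval, and timeout are all illustrative assumptions. It relies only on the `text` query parameter that the GET branch of the handler reads.

```python
import time
import urllib.parse
import urllib.request


def fetch_once(url, text, timeout):
    """Send one GET request with ?text=... and return the HTTP status code."""
    query = urllib.parse.urlencode({"text": text})
    with urllib.request.urlopen(f"{url}?{query}", timeout=timeout) as resp:
        return resp.status


def probe(url, attempts=30, interval=1.0, timeout=5.0, fetch=fetch_once):
    """Call the endpoint repeatedly; return a list of (seconds since start, outcome)."""
    results = []
    start = time.monotonic()
    for _ in range(attempts):
        elapsed = round(time.monotonic() - start, 1)
        try:
            # Sample text is arbitrary; any short misspelled string will do.
            outcome = fetch(url, "helo wrld", timeout)
        except OSError as exc:
            # URLError, HTTPError and socket timeouts are all OSError subclasses.
            outcome = f"failed: {exc.__class__.__name__}"
        results.append((elapsed, outcome))
        time.sleep(interval)
    return results
```

Calling `probe` with the full URL of the `/api/testFunction` route and printing the returned pairs should show the elapsed time at which the outcomes switch from status codes to failures, which can then be compared against the container logs for the same moment.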