I have an app that converts audio files to text, built with Flask and Flask-SocketIO. It works perfectly when I run it with "python run.py", but when I run it with "gunicorn -k eventlet -b 0.0.0.0:5000 run:app" it hangs at the point where it calls the Google Speech-to-Text API in audio.py.

This is the current code.

run.py:

from ats import socketio, app, db

if __name__ == '__main__':
    db.create_all()
    socketio.run(app, host='0.0.0.0', port=5001, debug=True)

__init__.py

import logging, json

from flask import Flask, jsonify, render_template, request
from flask_socketio import SocketIO, emit, send
from flask_cors import CORS
from flask_sqlalchemy import SQLAlchemy
from flask_marshmallow import Marshmallow

app = Flask(__name__, instance_relative_config=True, static_folder="templates/static", template_folder="templates")

# Create db instance
db = SQLAlchemy(app)
ma = Marshmallow(app)

@app.route('/')
def index():
    return render_template('index.html')

# import models
from ats import models

# set up CORS
CORS(app)
socketio = SocketIO(app, cors_allowed_origins='*', async_mode='eventlet')


# import blueprints
from ats.product.product import product_blueprint

# register blueprints
app.register_blueprint(product_blueprint, url_prefix='/api/product')

from ats import error_handlers

product.py

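import os
import math
import eventlet
from os.path import join
from flask import Blueprint, request, jsonify, abort
from flask_socketio import emit
from ats.utils import audio as AUDIO
# NOTE: FILE, Audio, and duration are used below but are not defined in this excerpt.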

product_blueprint = Blueprint('product', __name__)

@product_blueprint.route('/add', methods=['post'])
def addProduct():
    try:
        data = request.form

        foldername = data['name']
        scriptFile = request.files['script']
        audioFile = request.files['audio']

        # save the script and audio file to uploads folder
        FILE.createFolder(foldername)
        FILE.save(foldername, scriptFile)
        FILE.save(foldername, audioFile)

        # list the files in the uploads
        audioFiles = FILE.getAudioFileList(foldername)

        fileCount = len(audioFiles)
        currentFile = 1
        # ============ speech to text =============
        for file in audioFiles:
            recognizedText = AUDIO.convert(foldername, file)

            # save to database
            newAudio = {
                'name': file,
                'recognizedText': recognizedText,
                'length': duration,
            }
            Audio.add(newAudio)

            # emit event to update the client about the progress
            percent = math.floor((currentFile / float(fileCount) ) * 100) 
            emit('upload_progress', {'data': percent}, room=data['sid'], namespace='/')
            eventlet.sleep()
            currentFile += 1

        # Delete the files in uploads folder
        FILE.delete(foldername)

        return jsonify({'data': None, 'message': 'Product was added.', 'success': True}), 200
    except Exception as e:
        abort(500, str(e))

audio.py

import io
import os
from ats import app

# Imports the Google Cloud client library
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types

# Instantiates a client
client = speech.SpeechClient()

def convert(foldername, filename):
    try:

        file = os.path.join(app.config['UPLOAD_FOLDER'], foldername, filename)

        # Loads the audio into memory
        with io.open(file, 'rb') as audio_file:
            content = audio_file.read()
            audio = types.RecognitionAudio(content=content)

        config = types.RecognitionConfig(
            encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=48000,
            language_code='ja-JP')

        # Call speech in the audio file
        response = client.recognize(config, audio)  # The code hangs here, which results in a worker timeout in gunicorn

        return response
    except Exception as e:
        raise e

I've been searching for a solution for almost a week but still couldn't find one. Thank you for your help, guys.

Hokage Sama
  • Are you monkey patching the standard library so that it is compatible with eventlet? – Miguel Grinberg Feb 04 '20 at 12:29
  • Yes, I added monkey patching at the top of my __init__ file, but it still doesn't work. I switched to uwsgi with gevent instead of gunicorn and created a systemd service on an Ubuntu server, and it works now. Thanks – Hokage Sama Feb 05 '20 at 03:25

4 Answers

When you run your application directly with python run.py, no timeout is applied, so the application takes whatever time it needs to process a request. When you run it under Gunicorn, however, the default timeout is 30 seconds, which means you will get a timeout error if your application does not respond within 30 seconds. To avoid this, you can increase Gunicorn's default timeout by adding --timeout <timeinterval-in-seconds>.

The following command sets the timeout to 10 minutes:

gunicorn -k eventlet -b 0.0.0.0:5000 --timeout 600 run:app
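The same settings can also live in a Gunicorn configuration file, which is plain Python. The following is a minimal sketch, assuming a file named gunicorn.conf.py in the project root (the file name and location are assumptions; the values mirror the command above). It only lengthens how long a worker may stay silent before it is restarted; it does not make a stalled call finish.

```python
# gunicorn.conf.py (assumed file name; pass it explicitly with -c)
# Each module-level name below is a standard Gunicorn setting.

# Address and port to listen on, same as -b 0.0.0.0:5000
bind = "0.0.0.0:5000"

# Worker type, same as -k eventlet
worker_class = "eventlet"

# Seconds a worker may go without responding before it is killed,
# same as --timeout 600
timeout = 600
```

With this file in place, the command line shortens to: gunicorn -c gunicorn.conf.py run:app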

  • Thank you for your feedback sir, I will try this one and update you later on if it will work. – Hokage Sama Feb 03 '20 at 09:33
  • Here's the update, sir: I tried it with --timeout 600, but it just waits 600 seconds before the timeout appears. The conversion of audio to text usually takes about 1 to 2 seconds, so I think there is another problem here. – Hokage Sama Feb 03 '20 at 09:44
  • Giving a timeout of 600 seconds does not make the server wait 600 seconds to respond, rather it waits for a maximum time of 600 seconds and then gives a timeout if the server still does not respond. Timeout only occurs when your server is not responding back to the client within the specified time. –  Feb 03 '20 at 09:48
  • There's something else wrong with running Google speech to text with gunicorn. When I run with just flask the request completed within 2 minutes. Somehow running Google speech to text with gunicorn makes the program stall... – Pinyi Wang Jul 08 '20 at 08:48
  • The timeout can also be set in gunicorn.conf – Andrew_Dublin Aug 31 '21 at 15:22

It's working now that I run it with uwsgi instead of gunicorn. Here are the uwsgi config, the systemd service, and the nginx config.

ats.ini

[uwsgi]
module = wsgi:app

master = true
processes = 1

socket = ats.sock
chmod-socket = 660
vacuum = true

die-on-term = true

/etc/systemd/system/ats.service

[Unit]
Description=uWSGI instance to serve ats
After=network.target

[Service]
User=ubuntu
Group=www-data
WorkingDirectory=/home/user/ats
Environment="PATH=/home/user/ats/env/bin"
ExecStart=/home/user/ats/env/bin/uwsgi --ini ats.ini --gevent 100

[Install]
WantedBy=multi-user.target

nginx

server {
    listen 80;
    server_name <ip_address or domain>;
    access_log  /var/log/nginx/access.log;

    location / {
        include uwsgi_params;
        uwsgi_pass unix:/home/user/ats/ats.sock;
        proxy_set_header Connection "Upgrade";

        client_max_body_size 200M;
    }

    location /socket.io {
        include uwsgi_params;
        uwsgi_pass unix:/home/user/ats/ats.sock;
        proxy_set_header Connection "Upgrade";
    }
}

Thank you guys

Hokage Sama

The Google Cloud Python client had a conflict with gevent. I found out from this thread that, for the two to work together, you need to add the following at the beginning of __init__.py:

from gevent import monkey
monkey.patch_all()

import grpc.experimental.gevent as grpc_gevent
grpc_gevent.init_gevent()
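To confirm that patching really took effect before the Speech client is created, a small diagnostic can help. This is only a sketch using the standard library: the helper names is_green and socket_is_patched are hypothetical, and the check assumes that eventlet and gevent replace socket.socket with a class defined inside their own packages.

```python
import socket

# Top-level packages whose monkey patching replaces socket.socket
# (assumption: the replacement class lives in the patching package).
GREEN_PACKAGES = ("eventlet", "gevent")


def is_green(cls):
    """Return True if cls is defined in an eventlet or gevent module."""
    return cls.__module__.split(".")[0] in GREEN_PACKAGES


def socket_is_patched():
    """Return True if socket.socket currently looks monkey patched."""
    return is_green(socket.socket)


if __name__ == "__main__":
    # Run this after the monkey-patching lines and before creating the client.
    print("socket patched:", socket_is_patched())
```

If this reports False right before speech.SpeechClient() is constructed, the patching probably ran too late or not at all.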
Pinyi Wang

I ran into this problem today too, and eventually found that it was caused by my proxy settings. At first, I had set my proxy variables to empty strings:

os.environ['http_proxy'] = ""
os.environ['https_proxy'] = ""

and I got a request timeout error. After I commented out that code, it worked:

# os.environ['http_proxy'] = ""
# os.environ['https_proxy'] = ""

I think this is not a problem with gunicorn's default timeout setting but with the system proxy settings.
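If the empty values come from somewhere outside your own code, one option is to remove them at startup rather than leave them set to "". The sketch below is an illustration only: drop_empty_proxy_vars and PROXY_VARS are hypothetical names, and whether a given library treats an empty proxy value differently from an unset one is an assumption based on the behaviour described above.

```python
import os

# Proxy variable names to inspect, in both common spellings.
PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")


def drop_empty_proxy_vars(env=os.environ):
    """Delete proxy variables that are set to an empty or blank string.

    Returns the names that were removed; non-empty values are left alone.
    """
    removed = []
    for name in PROXY_VARS:
        if name in env and env[name].strip() == "":
            del env[name]
            removed.append(name)
    return removed


if __name__ == "__main__":
    # Call this before creating any HTTP or gRPC clients.
    print("removed:", drop_empty_proxy_vars())
```

Passing a plain dict instead of os.environ makes it easy to try the function without touching the real environment.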

Peritract
zhicheng duan