
I have written a small Flask app to stream multiple log files to a browser over the internet.

import json
import os
import re
import flask
from shelljob import proc

import eventlet
eventlet.sleep()
eventlet.monkey_patch()

app = flask.Flask(__name__)

@app.route( '/stream/<string:case_name>/<string:wind_dir>' )
def stream(case_name, wind_dir):
    g = proc.Group()
    foamrun = os.environ["FOAM_RUN"]
    foamcase = os.path.join(foamrun, case_name, wind_dir)
    log_file = os.path.join(foamcase, 'logs', 'run.log')
    print log_file
    p = g.run( [ "tail", "-f", log_file ] )
    def read_process():
        while g.is_pending():
            lines = g.readlines()
            for proc, line in lines:
                # process line and create payload (actual processing omitted)
                payload = line
                yield "data:" + json.dumps(payload) + "\n\n"

    return flask.Response( read_process(), mimetype='text/event-stream' )

@app.after_request
def after_request(response):
  response.headers.add('Access-Control-Allow-Origin', '*')
  response.headers.add('Access-Control-Allow-Headers', 'Content-Type,Authorization')
  response.headers.add('Access-Control-Allow-Methods', 'GET')
  return response

if __name__ == "__main__":
    foamrun = os.environ["FOAM_RUN"]
    app.run(threaded=True, host='0.0.0.0', port=9001)

I run this app with gunicorn using the command

gunicorn server:app -k eventlet -b 0.0.0.0:9001

When I open the two links:

http://X.X.X.X:9001/stream/test01_base_Baseline/NW
http://X.X.X.X:9001/stream/test01_base_Baseline/N

I see strange behaviour. One of the two streams works as I expect, but the other one hangs or arrives in bursts. For example, on the first page I receive a line every second, while on the second page I receive around 15-20 lines every 20 seconds or so. The behaviour is also not consistent: sometimes it is the first page that hangs and the second that behaves regularly.

I am quite new to web development.

EDIT

I have tried to replace read_process with a much simpler version

def read_process():
    i = 1
    while True:
        payload = 'line' + str(i)
        i += 1
        yield "data:" + json.dumps(payload) + "\n\n"
        sleep(1)

This version does not have the same issue and behaves as I would expect: the two streams are received together.

Rojj
  • Please try to execute `eventlet.monkey_patch()` as early as possible - in first line. If that doesn't help, try to offload shelljob into threadpool. `g = eventlet.tpool.Proxy(proc.Group())` – temoto Nov 16 '17 at 10:14
  • No difference moving `eventlet.monkey_patch()` on the first line. The second option actually makes it worse. They both get stuck and I receive the error: `File "/usr/local/lib/python2.7/dist-packages/eventlet/queue.py", line 118, in switch, self.greenlet.switch(value), error: cannot switch to a different thread` – Rojj Nov 16 '17 at 18:13
  • please try threadpooling only readlines. `lines = eventlet.tpool.execute(g.readlines)` – temoto Nov 16 '17 at 18:45
  • same behaviour unfortunately – Rojj Nov 16 '17 at 19:08
  • Same `both stuck and error`? – temoto Nov 16 '17 at 19:13
  • Same error. With the `tpool` approach one of the two streams does not even start – Rojj Nov 16 '17 at 19:16

1 Answer


What's happening: `g.readlines()` blocks until more data is available from the `tail` process. Unfortunately, it blocks the whole program instead of just one green thread.

It should be possible to use `eventlet.tpool.Proxy(open(log_file, 'rb'))` and drop the `tail` process entirely.
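For example, something along these lines might work as a replacement for `read_process` (an untested sketch; it assumes the stream should start at the current end of the log file, like `tail -f`, and that each raw line is the payload):

import os
import json
import eventlet
from eventlet import tpool

def read_process():
    # Wrap the file object in a tpool proxy so blocking reads run in
    # eventlet's OS thread pool instead of blocking every green thread.
    f = tpool.Proxy(open(log_file, 'rb'))
    f.seek(0, os.SEEK_END)              # start at the end, like `tail -f`
    while True:
        line = f.readline()
        if not line:
            eventlet.sleep(0.5)         # no new data yet; let other greenlets run
            continue
        payload = line.rstrip().decode('utf-8', 'replace')  # placeholder processing
        yield "data:" + json.dumps(payload) + "\n\n"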

If that fails, the next best option right now is to do the file operations in a separate OS thread, pass the data through global variables, and synchronize access to those variables via a local socket. I know that's lame, because it duplicates half the code from shelljob and the other half from eventlet.tpool. Sorry, we have a bug in tpool that prevents an easier solution.
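Roughly, a hypothetical sketch of that workaround, simplified to use a plain lock and polling instead of a local socket. `follow_file` and the shared list are made-up names, and the unpatched `threading` and `time` modules are obtained via `eventlet.patcher.original` so the reader really runs in an OS thread:

import os
import json
import eventlet
from eventlet import patcher

_threading = patcher.original('threading')   # real OS threads, not green ones
_time = patcher.original('time')

log_lines = []                               # shared buffer
log_lock = _threading.Lock()

def follow_file(path):
    # Runs in a real OS thread, so the blocking file reads cannot freeze greenlets.
    with open(path, 'rb') as f:
        f.seek(0, os.SEEK_END)
        while True:
            line = f.readline()
            if line:
                with log_lock:
                    log_lines.append(line)
            else:
                _time.sleep(0.5)             # at EOF, wait for new data

def read_process():
    t = _threading.Thread(target=follow_file, args=(log_file,))
    t.daemon = True
    t.start()
    while True:
        with log_lock:
            pending = log_lines[:]
            del log_lines[:]
        for line in pending:
            payload = line.rstrip().decode('utf-8', 'replace')
            yield "data:" + json.dumps(payload) + "\n\n"
        eventlet.sleep(0.5)                  # poll; avoids cross-thread switch errors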

temoto