
I would like to implement an (open source) web application where the user sends some kind of request via his browser to a Python web application. The request data is used to define and submit some kind of heavy computing job. Computing jobs are outsourced to a "worker backend" (also Python). During job processing, the job goes through different stages over time (from "submitted" through intermediate states to, ideally, "finished"). What I would like to accomplish is to display the current job state to the user in real time. This means that the worker backend has to communicate job states back to the web application, and the web application then has to push that information to the user's browser. Here is a picture that schematically describes the basic idea: [schematic problem description]

The numbers in red circles indicate the chronological order of events. "web app" and "worker backend" are still to be designed. Now, I would be grateful if you could help me with some technology decisions.

My questions, specifically:

  1. Which messaging technology should I apply between web app and worker backend? When the worker backend emits a signal (a message of some kind) about a certain job, it must trigger some event in the web application. Hence, I need some kind of callback that is associated with the client that initially requested the job submission. I think I need some pub/sub mechanism here, where the worker backend publishes and the web app subscribes; when the web app receives a message, it reacts to it by sending a status update to the client. I want the worker backend to be scalable and strongly decoupled from the web application, so I was thinking about using Redis or ZeroMQ for this task (see the sketch after these two questions). What do you think? Is my whole approach a bit too complicated?

  2. Which technology should I use for pushing information to the browser? Just out of perfectionism I'd like to have real-time updates: I do not want to poll with a high frequency, I want an immediate push to the client when the worker backend emits a message :-). Also, I do not need maximum browser support; first of all, this project is more or less a tech demo for myself. Should I go for HTML5 server-sent events / WebSockets? Or would you recommend otherwise?
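
To make question 1 more concrete, here is a minimal sketch of the decoupling I have in mind, assuming Redis pub/sub (the channel name and message format are just placeholders, and the push-to-browser part is exactly what question 2 is about):

    # Minimal sketch (assumption: Redis pub/sub; channel name and payload made up).
    import json
    import redis

    # --- worker backend: publish state changes ---
    def report_state(job_id, state):
        r = redis.StrictRedis(host="localhost", port=6379)
        r.publish("job-status", json.dumps({"job_id": job_id, "state": state}))

    # --- web app: subscribe and forward to the right client ---
    def listen_for_updates(push_to_browser):
        r = redis.StrictRedis(host="localhost", port=6379)
        pubsub = r.pubsub()
        pubsub.subscribe("job-status")
        for message in pubsub.listen():
            if message["type"] != "message":
                continue
            update = json.loads(message["data"])
            # push_to_browser() stands for whatever mechanism question 2 ends up with
            push_to_browser(update["job_id"], update["state"])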

Big big thanks for your recommendations in advance.

Dr. Jan-Philip Gehrcke

3 Answers


An option would be using WebSocket. If you go that road, you might check out Autobahn, which includes WebSocket clients and servers for Python (Twisted), as well as an RPC+PubSub protocol on top of WebSocket (with libraries for Python, JavaScript and Android). Using RPC+PubSub saves significant work and might fit your needs (job submission => RPC, job work updates => PubSub).

AutobahnPython runs on Twisted, which can additionally act as a WSGI container; this makes it possible to run Flask (or another WSGI-based web framework) alongside it, so you can run everything on one port/server. There is an example for the latter in the Autobahn GitHub repository.

Disclaimer: I am the original author of Autobahn and WAMP, and I work for Tavendo.

Details: I assume your workers do CPU-intensive and/or blocking stuff.

First, are your workers pure Python or external programs?

If the latter, you can use Twisted process protocol instances which communicate via stdio pipes (in a non-blocking manner) from the main Twisted thread. If the former, you can use Twisted's background thread pool via deferToThread (see: http://twistedmatrix.com/documents/current/core/howto/threading.html).
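
For the pure-Python case, a minimal sketch of the deferToThread pattern (heavy_computation and the job_spec argument are placeholders of mine):

    # Sketch: running a blocking/CPU-heavy job on Twisted's thread pool.
    # heavy_computation() and job_spec are hypothetical placeholders.
    from twisted.internet.threads import deferToThread

    def heavy_computation(job_spec):
        # blocking work happens here, off the reactor thread
        return "finished"

    def on_job_done(job_spec, state):
        # runs back on the main reactor thread; safe to touch
        # protocol/factory objects from here
        print("job %s reached state %s" % (job_spec, state))

    def run_job(job_spec):
        d = deferToThread(heavy_computation, job_spec)
        d.addCallback(lambda state: on_job_done(job_spec, state))
        return d

    # the surrounding Twisted/Autobahn application starts reactor.run()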

Autobahn runs on the main Twisted reactor thread. If your worker does too (see the comments before), then you can directly call methods on the WebSocket/WAMP factory/protocol instances. If not (i.e. the worker runs on a background thread), you should call those methods via callFromThread.

If you use WAMP, the main thing is to get a reference to the WampServerFactory into each worker. The worker can then dispatch a PubSub event to all subscribers by calling the appropriate factory method.
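
Roughly like this, as a sketch against the (old) WAMP v1 API of AutobahnPython; the topic URI, port and message layout are made up, and newer Autobahn releases have moved/renamed parts of this API:

    # Sketch based on AutobahnPython's WAMP v1 examples; details are assumptions.
    from twisted.internet import reactor
    from autobahn.websocket import listenWS
    from autobahn.wamp import WampServerFactory, WampServerProtocol

    TOPIC = "http://example.com/job-status"   # hypothetical topic URI

    class JobStatusProtocol(WampServerProtocol):
        def onSessionOpen(self):
            # let connected browsers subscribe to job status events
            self.registerForPubSub(TOPIC, True)

    factory = WampServerFactory("ws://localhost:9000")
    factory.protocol = JobStatusProtocol
    listenWS(factory)

    def worker_reports_state(job_id, state):
        # called from a background worker thread, hence callFromThread;
        # factory.dispatch() pushes the event to all subscribed clients
        reactor.callFromThread(factory.dispatch, TOPIC,
                               {"job_id": job_id, "state": state})

    # reactor.run() is started by the main application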

oberstet

Since you are talking about a Python web application, I would recommend you look into the following:

Which messaging technology should I apply between web app and worker backend?

Celery - break down your jobs into smaller tasks which return results that need to be shown to the client

Which technology should I use for pushing information to the browser?

Either Socket.IO on a Node.js-style server-side JS framework, or a WebSocket library for your Python web framework

If you are not tied to Python too much, check out Meteor.

Based on this thread, other ways to update progress from the server to the web client in real time could include writing the progress status to a Redis database, or using Orbited/Morbid (both based on Twisted) with the STOMP protocol, driven by asynchronous results from Celery's subtasks.
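
For illustration, a rough sketch of how a Celery task could report intermediate job states via custom task states (broker/backend URLs, task name and the meta payload are assumptions of mine, using the current Celery API):

    # Sketch: a Celery task reporting intermediate progress via update_state().
    # Broker/backend URLs, task name and the meta dict are placeholders.
    from celery import Celery

    app = Celery("jobs",
                 broker="redis://localhost:6379/0",
                 backend="redis://localhost:6379/1")

    @app.task(bind=True)
    def heavy_job(self, job_spec):
        for step in range(10):
            # ... do one chunk of work ...
            self.update_state(state="PROGRESS", meta={"step": step, "total": 10})
        return {"state": "finished"}

    # The web app (or a push layer) can then inspect the state:
    #   result = heavy_job.delay(job_spec)
    #   result.state, result.info   # e.g. "PROGRESS", {"step": 3, "total": 10}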

Pratik Mandrekar
  • Thanks for your answer and for recommending Celery. However, it could be overkill; I'll have to look further into it. Do you think there is a way to use Celery for communicating intermediate job states? I do not want to have a ping-pong between web app and worker backend, i.e. I'd rather not divide my jobs into sub-jobs. If I go for WebSockets, I think I'd use gevent and gevent-websocket or ws4py. – Dr. Jan-Philip Gehrcke Oct 05 '12 at 13:18
  • Maybe web sockets and gevent are the better way to do it. I just added a little more information on how Celery and STOMP-based messaging could be used to achieve the same, or how the job status could be written to a database and updated to the client via short polling or something. – Pratik Mandrekar Oct 05 '12 at 15:01

To be of any use, your web application is going to have a database. I would create a table in that database specifically for these jobs, with a 'state' field for each job.

This simplifies your system, because you can just send off your request to start a job and hand it over to the backend workers (zmq is a good solution for this, IMO). Since you're using Python for the backend, it's trivial to have your workers either update their current job's state in the database themselves, or to have a separate 'updater' whose only job is updating fields in the database (keeping that logic separate makes for a cleaner solution and lets you start multiple updaters if you're doing a lot of updating).
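
For example, a worker-side state update could look roughly like this (table and column names, and the use of sqlite3, are placeholders; any DB-API driver would do):

    # Sketch: a worker recording its current job state in a jobs table.
    # Table/column names and the sqlite database file are placeholders.
    import sqlite3

    def set_job_state(job_id, state):
        conn = sqlite3.connect("webapp.db")
        with conn:   # commits on success
            conn.execute("UPDATE jobs SET state = ? WHERE id = ?", (state, job_id))
        conn.close()

    # called by the worker as the job progresses:
    #   set_job_state(42, "submitted")
    #   set_job_state(42, "running")
    #   set_job_state(42, "finished")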

Then for your frontend, since you do not want to poll the server, I'd do something of a 'long poll'. What you're essentially doing is polling the server, but the server never actually 'responds' until there is a change in the data you're interested in. As soon as there is a change, you respond to the request. At the frontend, you have your JS re-make the connection as soon as it receives the latest update. This solution is cross-browser compliant so long as you use a JS framework that is cross-browser as well (I would suggest jQuery).
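
On the server side, such a long-poll handler could look roughly like this (a sketch assuming Flask with gevent, as mentioned in the comments above; get_job_state() is a hypothetical helper that reads the jobs table):

    # Sketch: a long-poll endpoint that only answers once the job state changes.
    # Flask + gevent are assumptions; get_job_state() is a hypothetical DB helper.
    import gevent
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    def get_job_state(job_id):
        # placeholder: read the current state from the jobs table
        raise NotImplementedError

    @app.route("/poll/<int:job_id>")
    def poll(job_id):
        known_state = request.args.get("known_state")
        # block (cooperatively, thanks to gevent) until the state changes,
        # then answer; the browser immediately issues the next long poll
        while True:
            state = get_job_state(job_id)
            if state != known_state:
                return jsonify(job_id=job_id, state=state)
            gevent.sleep(0.5)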


To eliminate the web app's database polling, do the following:

Make the initial request a long-poll request to the web app. The web app sends off a zmq message to your backend (this would probably need to be done with a REQ/REP socket) and waits until it gets a message back from the zmq backend with a state change. When it gets a state change, it responds to the frontend with that change. At this point, the frontend sends out a new long-poll request (carrying this job's id, which can serve as its identity) and the web app reconnects to the backend and waits for another state change. The trick to make this work is to set ZMQ's ZMQ_IDENTITY on the socket when it is originally created (in the first request). This allows the web app to reconnect to the same backend socket and get new updates. When the backend has a new update to send, it signals the web app, which in turn responds to the long-poll request with the state change. This way there is no polling, no backend database, and everything is event driven from the backend workers.
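
A rough sketch of the web-app side of that scheme with pyzmq (addresses, the message format and the job-id scheme are made up; error handling and the watchdog mentioned below are omitted):

    # Sketch: web-app side of the long-poll <-> ZeroMQ bridge described above.
    # Addresses, the JSON payload and the job-id scheme are placeholders.
    import zmq

    context = zmq.Context.instance()

    def wait_for_state_change(job_id):
        # The REQ socket's identity is the job id, so the backend (a ROUTER
        # socket) can route updates for this job back to "the same" peer even
        # though each long-poll request creates a fresh connection.
        sock = context.socket(zmq.REQ)
        sock.setsockopt(zmq.IDENTITY, str(job_id).encode())
        sock.connect("tcp://backend-host:5555")
        sock.send_json({"job_id": job_id, "want": "next_state_change"})
        update = sock.recv_json()        # blocks until the backend replies
        sock.close()
        return update["state"]

    # each incoming long-poll request for a job calls wait_for_state_change()
    # and returns its result to the browser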

I'd set up some sort of watchdog so that if the frontend goes away (switches pages or closes the browser), the backend sockets are properly closed. No need for them to sit there blocking indefinitely when the state has changed.

g19fanatic
  • Thanks for your answer. Right, long polling might be "enough" for this problem. Can you think of a good way to implement the "callback behavior" that triggers sending a response to the browser when a zmq message arrives at the web app from the worker backend? – Dr. Jan-Philip Gehrcke Oct 05 '12 at 13:22
  • In my described solution, there is no zmq message that goes between the backend and the web app. The backend workers (either by themselves or through another DB backend piece) directly update the database with their current job's state. The web app essentially polls the database, looking for a state change in the job that the frontend sent its long-poll request for. The beauty is that there is almost no latency on the server (DBs are usually hosted on localhost relative to the web server) and it is essentially real-time to the frontend. – g19fanatic Oct 05 '12 at 13:28
  • So your plan includes frequent polling of the database from within the web app. I've already implemented a prototype a few months ago using a Redis DB holding the job states and being polled frequently in order to identify job state changes. While this works, I do not think that it is an elegant solution. I do not need a database for other purposes. In my opinion an elegant solution is purely event-based with no superfluous polling. – Dr. Jan-Philip Gehrcke Oct 05 '12 at 13:36
  • Check the answer, I added some more information that will eliminate the web-app polling AND will eliminate the need for a database. Use a unique id for each job and use that for the ZMQ_IDENTITY. Pass it on to the browser for the first request and the frontend will be able to generically request the current status based upon this 'identity' through the web-app. – g19fanatic Oct 05 '12 at 14:15
  • All answers are great and help me during system design. You get the green checkmark for pointing out that long polling in this case is all I need. – Dr. Jan-Philip Gehrcke Oct 12 '12 at 20:48