
So we have a Django backend server that communicates with a large number of 3rd party servers. These 3rd party servers all use socket connections for communication, not HTTP. Each user request to our service communicates (over a socket) with one of many possible 3rd party servers. Each 3rd party server can take 1-20 seconds to respond, although usually it is around 1-5 seconds. Of course, when a user goes to our webpage to make a request for one of those 3rd party services, we want to respond to the user as quickly as possible, i.e. not block on our server while waiting for the response. When the 3rd party server responds, we then want to push the response back to the user's browser.
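For concreteness, here is a minimal sketch of one such blocking exchange in Python. The host, port, and newline-delimited command/response framing are assumptions for illustration, not the real 3rd party protocol:

```python
import socket

# Minimal sketch of one blocking exchange with a 3rd party server.
# The newline-delimited framing is an assumption; substitute whatever
# framing the real service actually uses.
def query_third_party(host, port, command, timeout=25):
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode() + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
            if data.endswith(b"\n"):  # assumed end-of-response marker
                break
        return b"".join(chunks).decode()
```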

Certainly this is a common problem. But the key here is that we will be issuing requests to our web server every 5 seconds or so (e.g. using JavaScript/AJAX in our web pages). I understand that if we were able to create a websocket connection for the response and leave it open, and/or if our requests were really long-running (say >30 seconds), then websockets would be a good way to go for server push. However, for a variety of reasons we can't do that, so we'd need to establish a new websocket connection for every request.

It seems to me that this means going through the whole process every 5 seconds: open a websocket, configure it to match up with the proper 3rd party service connection, send a command down, proxy that command to the correct 3rd party service, get the response, proxy that back to the proper websocket, send the response to the browser, and then do it all again 5 seconds later. Is that really any better than basic short polling?

Our short polling approach would be to use a basic AJAX call to send the request down and immediately return success to the browser. The backend would proxy the request to the proper 3rd party service, and when the results come in, it would save them to a MySQL table. Our AJAX code would then send a polling request every second or two (possibly with backoff, e.g. 1, 1, 2, 4, 6, 10, ... seconds) until the response is received or a timeout occurs. The implementation for this would certainly be WAY simpler and would be pretty much guaranteed to work.

In the vast majority of cases, we would issue our "once every 5 seconds" command and get the response back on the first or second polling attempt. If we used websockets, it would take a connection attempt plus one or two socket writes to configure the backend proxy for the proper backend service, and then we'd get the response, the socket would close, and we'd have to do it all over again 5 seconds later.
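To make the comparison concrete, here is a rough sketch of the two short-polling endpoints. The model, view names, and the background-thread hand-off are all illustrative assumptions (a real deployment would more likely use a task queue such as Celery), and `query_third_party()` is the socket helper sketched above:

```python
# models.py (of whichever app owns this; names are hypothetical)
import uuid
from django.db import models


class ProxyResult(models.Model):
    request_id = models.CharField(max_length=36, unique=True)
    status = models.CharField(max_length=10, default="pending")  # pending/done/error
    payload = models.TextField(blank=True)


# views.py
import threading
from django.http import JsonResponse
from myapp.models import ProxyResult          # path is hypothetical
from myapp.thirdparty import query_third_party  # the socket helper sketched above


def submit(request):
    command = request.GET.get("command", "")
    request_id = str(uuid.uuid4())
    ProxyResult.objects.create(request_id=request_id)

    def work():
        # Proxy to the 3rd party service off the request thread, then save
        # the result so later polls can pick it up.
        try:
            payload = query_third_party("thirdparty.example.com", 9000, command)
            ProxyResult.objects.filter(request_id=request_id).update(
                status="done", payload=payload)
        except Exception:
            ProxyResult.objects.filter(request_id=request_id).update(status="error")

    threading.Thread(target=work, daemon=True).start()
    return JsonResponse({"request_id": request_id})  # respond to the browser right away


def poll(request):
    result = ProxyResult.objects.get(request_id=request.GET["request_id"])
    if result.status == "pending":
        return JsonResponse({"ready": False})
    return JsonResponse({"ready": True, "status": result.status,
                         "payload": result.payload})
```

The browser would call `submit` once, then call `poll` with the returned `request_id` on the 1, 1, 2, 4, 6, 10, ... second backoff schedule described above.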

So wouldn't short-polling work just fine and possibly better in this situation?

Marc
  • What kind of data you get from third-party servers? Is it shared among users? Is it unique for each request? Could you tell us an example (e.g. "third-party servers provide foreign currency exchange rates") – Max Malysh Jul 28 '17 at 02:26

1 Answer


If you're just going to set up a new webSocket connection for every single request, then using a webSocket is more overhead than an http request.

Every webSocket connection starts with an http request/response just to get the webSocket initialized, and then you still have to send your own messages over it - that's more back and forth than a single http request/response.

The real savings here would come from solving whatever issues are making you think you need to create a new webSocket connection for every request. That is likely solvable. Then you create one webSocket connection and just send messages both ways to carry requests and responses, and that would definitely be more efficient than http polling.

To really get more efficient than http, you stop polling from the client entirely. Rather than the client asking "do you have anything new for me?" every 5 seconds, the client establishes one webSocket connection and tells the server what it wants to be notified about. The server then does whatever polling needs to be done, and whenever there is something the client wants, it sends it immediately instead of waiting for a polling interval. The client gets data faster (it doesn't have to wait for the next polling interval), and there are no empty polling requests/responses, so both server usage and bandwidth usage are better.
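As one possible sketch of that single long-lived connection on this particular stack, a Django Channels consumer could proxy commands to the 3rd party service and push replies back on the same connection. Channels itself, the JSON message format, and the newline-delimited 3rd party framing are all assumptions here, not anything the question specifies:

```python
import asyncio
import json

from channels.generic.websocket import AsyncWebsocketConsumer


class ProxyConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        # One WebSocket per browser, kept open for many commands.
        await self.accept()

    async def receive(self, text_data=None, bytes_data=None):
        msg = json.loads(text_data)
        # Handle each command in its own task so a slow 3rd party response
        # (1-20 s) doesn't stall other commands on the same connection.
        asyncio.create_task(self._proxy(msg))

    async def _proxy(self, msg):
        # Hypothetical message format: {"id": ..., "host": ..., "port": ..., "command": ...}
        reader, writer = await asyncio.open_connection(msg["host"], msg["port"])
        writer.write(msg["command"].encode() + b"\n")
        await writer.drain()
        reply = await reader.readline()  # assumed newline-delimited reply
        writer.close()
        await self.send(text_data=json.dumps(
            {"id": msg.get("id"), "payload": reply.decode()}))
```

The browser keeps this one WebSocket open and sends a command over it every 5 seconds, so the connection setup and configuration cost is paid once rather than on every request.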

jfriend00
  • @Marc - Does this answer your question? If so, you can indicate that to the community by clicking the green checkmark to the left of the answer and that will also earn you some reputation points. – jfriend00 Aug 18 '17 at 06:11