I'm running a simple Thrift server (http://thrift.apache.org/) as a cross-language platform between Python (the server) and Haskell (the client). The only data structure that needs to be sent across is a 3-tuple of doubles, so the server/client implementation is also very simple - it was sufficient to just follow the tutorials.

However, it is really, really slow! I'm getting response times of about 0.5s for each server response, when I require times of about 0.1s or lower.

Does anyone have any ideas on how to speed this up? You can see my simple server implementation below:

    import sys

    from vision import Vision
    from vision.ttypes import *

    from thrift.transport import TSocket
    from thrift.transport import TTransport
    from thrift.protocol import TBinaryProtocol
    from thrift.protocol.TBinaryProtocol import TBinaryProtocolAccelerated
    from thrift.server import TServer

    class VisionHandler:
      def observe(self):
        ret = Position()
        ret.x,ret.y,ret.z = (1,2,3)
        return ret

    ret = Position()

    handler = VisionHandler()
    processor = Vision.Processor(handler)
    transport = TSocket.TServerSocket(port=9090)
    tfactory = TTransport.TBufferedTransportFactory()
    pfactory = TBinaryProtocol.TBinaryProtocolFactory()

    server = TServer.TSimpleServer(processor, transport, tfactory, pfactory)

    print 'Starting the vision server...'
    server.serve()
    print 'done.'

The client simply queries this server by running

    client = do
      handle <- hOpen ("localhost", PortNumber 9090)
      let binProto = BinaryProtocol handle
      return (binProto, binProto)

and then

res <- Client.observe =<< client

As far as I'm aware, this is all pretty standard! Why is it so damn slow??

Thanks!

Tetigi

2 Answers

Aside from the great suggestion in Ellioh's answer regarding socket options, one problem is that your operation seems too small and fine-grained to be worth a socket round trip on its own, so most of the time is spent on network and similar latencies. Usually one would group calls so that each call transfers more data and does more work. Granularity is very important in distributed network applications, and you need to find the right level to get good performance.
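To make the granularity point concrete, here is a rough cost model (a sketch only, not a measurement of Thrift). The function name `total_latency` and its parameters are made up for illustration; it assumes every call pays a fixed overhead regardless of how many items it carries.

```python
def total_latency(n_items, batch_size, per_call_overhead, per_item_cost):
    """Estimate total time to fetch n_items when each call carries batch_size items.

    Assumption: each call pays a fixed per_call_overhead (seconds) plus
    per_item_cost (seconds) for every item it transfers.
    """
    # Integer ceiling division: number of calls needed to cover all items.
    calls = (n_items + batch_size - 1) // batch_size
    return calls * per_call_overhead + n_items * per_item_cost


# Ten positions fetched one at a time versus in a single grouped call,
# using the 0.5s per-call figure from the question as the fixed overhead.
one_by_one = total_latency(10, 1, 0.5, 0.0)
grouped = total_latency(10, 10, 0.5, 0.0)
```

Under this model the fixed overhead dominates, so grouping helps only if the client can actually use several results per round trip.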

Davorin Ruševljan
  • Yes, this did end up being the problem, unfortunately. I ended up going for a much simpler solution using a local HTTP server, which works nicely enough. – Tetigi Apr 24 '13 at 21:16
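For readers curious what a local HTTP server for this kind of data might look like, here is a minimal sketch using only the Python 3 standard library. It is not the asker's actual code: the handler name `PositionHandler`, the JSON field names, and the helper `fetch_position_once` are all assumptions made for illustration.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PositionHandler(BaseHTTPRequestHandler):
    """Answers every GET with a fixed 3-tuple of doubles encoded as JSON."""

    def do_GET(self):
        body = json.dumps({"x": 1.0, "y": 2.0, "z": 3.0}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to stderr.
        pass


def fetch_position_once():
    """Start the server on a free local port, query it once, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), PositionHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        payload = conn.getresponse().read()
        conn.close()
        return json.loads(payload.decode("utf-8"))
    finally:
        server.shutdown()
        server.server_close()
```

In a real setup the server would keep running and the Haskell client would issue the GET request; the helper here only exists to show one full round trip.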

Most likely this is because of socket options. I don't remember whether Thrift lets you set socket options, but setting TCP_NODELAY, which switches off Nagle's algorithm, has a chance of solving the problem.

If this is the same code you use, sockets are easily accessible. Try subclassing TSocket.

The option should be set on the sockets that actually send and receive data, on both the server side (the socket returned from accept()) and the client side (the socket created by the client). Thrift is not slow, so the problem should not be serialization, unless you are serializing something really monstrous. That means the problem is in all that "connect, send data, get an answer" stuff. It is almost surely caused by Nagle's algorithm, which TCP_NODELAY switches off.
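As a sketch of what "set it on both data sockets" means, the following uses plain Python sockets rather than Thrift's own classes. The helper names `enable_nodelay`, `nodelay_enabled` and `demo` are made up for illustration, and where exactly to hook this into Thrift's TSocket or TServerSocket is not shown here.

```python
import socket


def enable_nodelay(sock):
    """Disable Nagle's algorithm on a connected TCP socket."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nodelay_enabled(sock):
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


def demo():
    # The listening socket only accepts connections; it carries no payload.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    # Client side: set the option on the socket the client creates.
    client = enable_nodelay(socket.create_connection(("127.0.0.1", port)))

    # Server side: set the option on the socket returned from accept().
    conn, _ = listener.accept()
    enable_nodelay(conn)

    result = (nodelay_enabled(client), nodelay_enabled(conn))
    client.close()
    conn.close()
    listener.close()
    return result
```

Reading the option back with getsockopt() is a cheap way to confirm that the socket you modified is the one that actually carries the data.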

Ellioh
  • Hmmm... I'm wondering how to enable this. Unfortunately the Thrift documentation appears to be somewhat lacking. – Tetigi Feb 11 '13 at 11:16
  • Try finding a socket object. Something like setsockopt() should do the job. Look at TSocket first. – Ellioh Feb 11 '13 at 11:17
  • I'm a bit of a rookie when it comes to this kind of thing - I copied the implementation of TServerSocket. I tried adding self.handle.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1) to the handle definition but it's still just as slow... – Tetigi Feb 11 '13 at 12:49
  • Try adding both in server and client. Both sides should make no delay. – Ellioh Feb 11 '13 at 13:06
  • I've got code setting TCP_NODELAY on both sides to 1. It was fairly simple on haskell, but the python code is a bit less obvious to me - I tried sprinkling self.handle.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1) and self.handle.setsockopt(socket.SOL_TCP, socket.TCP_NODELAY, 1) liberally throughout my code but still no luck. – Tetigi Feb 11 '13 at 13:43
  • self.handle.socket in TServerSocket is not the socket that sends/receives data. It just accepts connections. Try setting NODELAY really everywhere. Otherwise you may try logging times to understand which operation causes the delay. A delay of 0.5s is huge, so it should not be hard to find. – Ellioh Feb 11 '13 at 13:48
  • I've put it everywhere, but still no luck. I'm not even 100% sure that it lies in the Python side to be honest. I'll get back with any alternatives I might try. – Tetigi Feb 11 '13 at 13:54
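The suggestion in the comments to log times per operation can be sketched as follows (Python 3). The context manager `timed` and the stage labels are hypothetical, and the `time.sleep` calls merely stand in for the real connect and call steps.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def demo():
    timings = {}
    with timed("connect", timings):
        time.sleep(0.01)  # stand-in for opening the connection
    with timed("call", timings):
        time.sleep(0.05)  # stand-in for the request/response round trip
    return timings
```

Wrapping connect, send and receive separately on both sides should show which stage accounts for the 0.5s.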