
I have an interesting project going on at our workplace. The task before us is this:

  • Build a custom server using Python
  • It has a web server part, serving REST
  • It has a FTP server part, serving files
  • It has a SMTP part, which receives mail only
  • And last but not least, it has a background worker that manages low-level file I/O based on requests received from the services mentioned above

Obviously, the go-to choice was the Twisted library/framework, which is an excellent networking tool. However, studying the docs further, a few things came up that I'm not sure about.

Coming from a Java background, I would solve the task (at least initially) by spawning a separate thread for each service and going from there. In Python, however, I cannot usefully do that because of the GIL. I'm not sure how Twisted handles this. I would expect a large part (if not the majority) of Twisted's code to be written in C, where the GIL is not an issue, but I couldn't find this explained in the docs to my satisfaction.

So the most outstanding question is: given that Twisted uses the Reactor as its main design pattern, will it be able to:

  1. Serve all those services needed
  2. Do it in a non-blocking fashion (it should, according to docs, but if someone could elaborate, I'd be grateful)
  3. Be able to serve a few hundred clients at once
  4. Serve large file downloads in a reasonable way, meaning that it can serve multiple clients, using multiple services, downloading and uploading large files.

Large files here means on the order of hundreds of MB or a few GB. The size is not important; it's the time that the client has to stay connected to the server that matters.

Edit: I'm actually inclined to go the Python multiprocessing route, but I'm not sure whether that's the correct thing to do with Twisted.

Tomáš Plešek
  • Yes, you could write a Twisted app to serve all those services. I'm not sure why you would want/need all those things in a monolithic app, though. – MattH Dec 09 '11 at 10:50
  • Well, there is a reason, but I didn't care to elaborate as that's not the issue. But there is one. :) – Tomáš Plešek Dec 09 '11 at 10:54
  • *The size is not important, it's the time that the client has to stay connected to the server that matters.* I'm not sure what you're implying. How is the time that the client has to stay connected important? And how is that not related to file size? – MattH Dec 09 '11 at 12:27
  • Well, what I meant is that while a client is actively connected to the server, it occupies a slot that cannot be used by anyone else, so the server has to handle that somehow, which the library should do fine. The sentence was badly phrased, sorry about that. It is related to size, of course. – Tomáš Plešek Dec 09 '11 at 20:09

1 Answer

  • Serve all those services needed

Yes.

  • Do it in a non-blocking fashion (it should, according to docs, but if someone could elaborate, I'd be grateful)

Twisted uses the common reactor model. I/O goes through your choice of poll, select, or a similar mechanism to determine whether data is available. The reactor handles only what is available and passes the data along to the other stages of your app. This is how it is non-blocking.
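To make the reactor idea concrete, here is a minimal sketch using only the standard library's `selectors` module. It is not Twisted code and not how Twisted is implemented internally; the names `add_service`, `run_once` and `demo` are made up for illustration. It shows two toy services (an echo and an upper-casing one) sharing a single loop in a single thread:

```python
import selectors
import socket


def add_service(sel, handler):
    """Listen on an ephemeral localhost port and register it with the loop."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, ("accept", handler))
    return listener


def run_once(sel, timeout=1.0):
    """One loop iteration: touch only sockets that are ready. Returns replies sent."""
    replies = 0
    for key, _events in sel.select(timeout):
        kind, handler = key.data
        sock = key.fileobj
        if kind == "accept":
            conn, _addr = sock.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ, ("read", handler))
        else:
            data = sock.recv(4096)
            if data:
                sock.send(handler(data))  # tiny reply; a real loop would buffer writes
                replies += 1
            else:
                sel.unregister(sock)
                sock.close()
    return replies


def demo():
    sel = selectors.DefaultSelector()
    # Two different "services" hooked onto the same loop.
    services = [add_service(sel, lambda d: d), add_service(sel, bytes.upper)]
    clients = [socket.create_connection(s.getsockname(), timeout=5) for s in services]
    for client in clients:
        client.sendall(b"hi!")
    replies = 0
    for _ in range(50):  # bounded so the sketch cannot spin forever
        replies += run_once(sel)
        if replies == len(clients):
            break
    answers = [client.recv(4096) for client in clients]
    for client in clients:
        client.close()
    for key in list(sel.get_map().values()):
        key.fileobj.close()
    sel.close()
    return answers
```

Calling `demo()` drives both services from the same loop. Nothing in `run_once` ever waits on a single connection, which is the property the reactor model relies on.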

I don't think it provides non-blocking disk I/O, but I'm not sure. That feature is not what most people need when they say non-blocking.
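If blocking disk reads do become a problem, the usual workaround in event-loop programs is to push them onto a worker thread. Twisted has `deferToThread` in `twisted.internet.threads` for this. The sketch below shows the same idea with the standard library's `asyncio` instead, so it is an illustration of the technique rather than Twisted code; `blocking_read`, `read_without_blocking_loop` and `demo` are hypothetical names:

```python
import asyncio
import os
import tempfile
import threading


def blocking_read(path):
    """Ordinary blocking file read; returns the data and the thread that ran it."""
    with open(path, "rb") as f:
        return f.read(), threading.get_ident()


async def read_without_blocking_loop(path):
    loop = asyncio.get_running_loop()
    # None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, blocking_read, path)


def demo():
    # Create a small temporary file to read back.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 1024)
    try:
        data, worker = asyncio.run(read_without_blocking_loop(tmp.name))
    finally:
        os.unlink(tmp.name)
    # The read ran on a different thread than the one driving the event loop.
    return len(data), worker != threading.get_ident()
```

CPython releases the GIL while a thread is blocked in a file read, so a thread used this way does not stall the event loop even though Python code itself does not run in parallel.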

  • Be able to serve a few hundred clients at once

Yes. No. Maybe. What are those clients doing? Is each hitting refresh every second on a browser making 100 requests? Is each one doing a numerical simulation of galaxy collisions? Is each sending the string "hi!" to the server, without expecting a response?

Twisted can easily handle 1000+ requests per second.

  • Serve large file downloads in a reasonable way, meaning that it can serve multiple clients, using multiple services, downloading and uploading large files.

Sure. For example, the mainline BitTorrent client, which is written in Python, used Twisted in some of its releases.
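The reason large transfers coexist well is that each one is sent a chunk at a time, so no single download holds the loop or needs the whole file in memory. Twisted ships a helper for this, `FileSender` in `twisted.protocols.basic`. The following is only a plain-Python sketch of the chunking idea, with a hypothetical chunk size and hypothetical function names, not Twisted's implementation:

```python
import io

CHUNK_SIZE = 64 * 1024  # assumed chunk size for this sketch


def iter_chunks(fileobj, chunk_size=CHUNK_SIZE):
    """Yield the file's contents in pieces of at most chunk_size bytes."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            return
        yield chunk


def serve_round_robin(files, chunk_size=CHUNK_SIZE):
    """Yield (client, chunk) pairs, giving each transfer one chunk per turn."""
    pending = {client: iter_chunks(f, chunk_size) for client, f in files.items()}
    while pending:
        for client in list(pending):
            chunk = next(pending[client], None)
            if chunk is None:
                del pending[client]  # this transfer is finished
            else:
                yield client, chunk
```

With two in-memory "files" and a chunk size of 2, `serve_round_robin({"a": io.BytesIO(b"aaaa"), "b": io.BytesIO(b"bb")}, chunk_size=2)` alternates between the clients until each file is exhausted. A real server would also wait for each socket to be writable before sending the next chunk.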

Andrew Dalke
  • Andrew, thank you for your insights; let me elaborate. As for the non-blocking explanation, I get what you're saying. I was a bit confused about how the reactor and producers/consumers work, but now I see the light. I described what the clients are doing too broadly. Mostly they'll be uploading/downloading files through PUT/GET, meaning they keep a connection open and pump data. What I was concerned about was whether it's OK to hook several different services onto the reactor. By the tone of your answer, it looks like I should be okay. – Tomáš Plešek Dec 09 '11 at 20:13