0

I have the following Resource to handle http POST request with twisted web:

class RootResource(Resource):
    isLeaf = True

    def errback(self, failure):
        print "Request finished with error: %s"%str(failure.value)
        return failure

    def write_response_happy(self, result):
        self.request.write('HAPPY!')
        self.request.finish()

    def write_response_unhappy(self, result):
        self.request.write('UNHAPPY!')
        self.request.finish()

    @defer.inlineCallbacks
    def method_1(self):
        #IRL I have many more queries to mySQL, cassandra and memcache to get final result, this is why I use inlineCallbacks to keep the code clean.        
        res = yield dbpool.runQuery('SELECT something FROM table')

        #Now I make a decision based on result of the queries:
        if res: #Doesn't make much sense but that's only an example
            self.d.addCallback(self.write_response_happy) #self.d is already available after yield, so this looks OK? 
        else:
            self.d.addCallback(self.write_response_unhappy)
        returnValue(None)

    def render_POST(self, request):
        self.request = request 
        self.d = self.method_1() 
        self.d.addErrback(self.errback)
        return server.NOT_DONE_YET

root = RootResource()
site = server.Site(root)
reactor.listenTCP(8002, site)
dbpool = adbapi.ConnectionPool('MySQLdb', host='localhost', db='mydb', user='myuser', passwd='mypass', cp_reconnect=True)
print "Serving on 8002"
reactor.run()

I've used the ab tool (from apache utils) to test 5 POST requests one after another:

ab -n 5 -p sample_post.txt http://127.0.0.1:8002/

Works fine!

Then I tried to run the same 5 POST requests simultaneously:

ab -n 5 -c 5 -p sample_post.txt http://127.0.0.1:8002/

Here I'm getting errors: exceptions.RuntimeError: Request.write called on a request after Request.finish was called. What am I doing wrong?

sberry
  • 128,281
  • 18
  • 138
  • 165
PawelRoman
  • 6,122
  • 5
  • 30
  • 40
  • You may have concurrent access to your result logger: closing it with a `finish()` while the other request is trying to write in it. – Mualig Jun 14 '12 at 15:12
  • Yes I do have concurrent access, I run 5 requests at a time. I am new to twisted web, what happens if I have concurrent requests? Shoudln't I get a separate Resource class instance each time? – PawelRoman Jun 14 '12 at 15:25
  • Since you have only 1 `RootResource` linked to your `reactor` who listen to your TCP requests. IMHO you only have 1 `Ressource` created. Try commenting the `Finish()` and doing it just once all requests are received. – Mualig Jun 14 '12 at 15:33
  • Hmm, but in real life when receiving real requests there isn't any point in time where I can say that 'all requests are received'. Requests keep coming all the time, they are of course concurrent, and all must be handled correctly. If I have only one Resource() instance, that means using self.foo won't work, I have to pass entire context (request object, all the data fetched from the database, etc.) thru the entire callback chain (unlike this example IRL I have >1k lines of code and many functions chained) to the final method that does write() and then finish(). Is this how I have to do it? – PawelRoman Jun 14 '12 at 15:44
  • I was just saying try to comment to see if the problem came from multiple Finish. I'm not saying that it's the way to do it. – Mualig Jun 14 '12 at 15:50

1 Answers1

1

As Mualig suggested in his comments, you have only one instance of RootResource. When you assign to self.request and self.d in render_POST, you overwrite whatever value those attributes already had. If two requests arrive at around the same time, then this is a problem. The first Request and Deferred are discarded and replaced by new ones associated with the request that arrives second. Later, when your database operation finishes, the second request gets both results and the first one gets none at all.

This is an example of a general mistake in concurrent programming. Your per-request state is kept where it is shared between multiple requests. When multiple requests are handled concurrently, that sharing turns into a fight, and (at least) one request has to lose.

Try keeping your per-request state where it won't be shared between multiple requests. For example, try keeping it on the Deferred:

class RootResource(Resource):
    isLeaf = True

    def errback(self, failure):
        print "Request finished with error: %s"%str(failure.value)
        # You just handled the error, don't return the failure.
        # Nothing later in the callback chain is doing anything with it.
        # return failure

    def write_response(self, result, request):
        # No "self.request" anymore, just use the argument
        request.write(result)
        request.finish()

    @defer.inlineCallbacks
    def method_1(self):
        #IRL I have many more queries to mySQL, cassandra and memcache to get final result, this is why I use inlineCallbacks to keep the code clean.        
        res = yield dbpool.runQuery('SELECT something FROM table')

        #Now I make a decision based on result of the queries:
        if res: #Doesn't make much sense but that's only an example
            # No "self.d" anymore, just produce a result.  No shared state to confuse.
            returnValue("HAPPY!")
        else:
            returnValue("UNHAPPY!")

    def render_POST(self, request):
        # No more attributes on self.  Just start the operation.
        d = self.method_1() 
        # Push the request object into the Deferred.  It'll be passed to response,
        # which is what needs it.  Each call to method_1 returns a new Deferred,
        # so no shared state here.
        d.addCallback(self.write_response, request)
        d.addErrback(self.errback)
        return server.NOT_DONE_YET
Jean-Paul Calderone
  • 47,755
  • 6
  • 94
  • 122