
I am running a Python application on App Engine using Django. Additionally, I am using a session-management library called gae-sessions. If threadsafe is set to "no", there is no problem, but when threadsafe is set to "yes", I occasionally see sessions being lost.

The issue I am seeing is that when threading is enabled, multiple requests are occasionally interleaved in the GAE-Sessions middleware.

Within the gae-sessions library, there is a variable called _tls, which is a threading.local() variable. When a user makes an HTTP request to the website, a function called process_request() is run first, followed by a bunch of custom HTML generation for the current page, and then a function called process_response() is run. State is remembered between process_request() and process_response() in the _tls "thread safe" variable. I am able to check uniqueness of the _tls variable by printing out the _tls value (e.g. "<thread._local object at 0xfc2e8de0>").
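
A note on this diagnostic: printing a `threading.local()` object shows the address of the container, and that container is one and the same object in every thread; only the attributes stored on it are per-thread. The following is a minimal sketch of that behaviour using plain Python threads. The names `worker` and `current_user` are illustrative and are not taken from gae-sessions.

```python
import threading

_tls = threading.local()  # a single module-level container object


def worker(name, results):
    # Each thread writes its own value; other threads never see it.
    _tls.current_user = name
    results[name] = (id(_tls), _tls.current_user)


results = {}
threads = [
    threading.Thread(target=worker, args=(name, results))
    for name in ("user1", "user2")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The container has the same identity in both threads...
same_container = results["user1"][0] == results["user2"][0]
# ...but the main thread never set the attribute, so it does not have it.
main_sees_attr = hasattr(_tls, "current_user")

print(same_container)
print(main_sees_attr)
```

If this holds in the App Engine runtime as well, an identical address in the printed value does not by itself show that two requests ran on the same thread.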

What I am occasionally witnessing is that multiple HTTP requests are being interleaved on what appears to be a single thread in the GAE-Sessions middleware. I infer that it is a single thread from the fact that the thread-local object has the same memory location in both requests, and from the fact that data from one request appears to overwrite data from another request. Given User1 and User2 making a request at the same time, I have witnessed the following execution order:

User1 -> `process_request` is executed on thread A
User2 -> `process_request` is executed on thread A
User2 -> `process_response` is executed on thread A
User1 -> `process_response` is executed on thread A

Given the above scenario, the User2 session stomps on some internal variables and causes the session of User1 to be lost.
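
The effect of that ordering can be reproduced without any threads at all, assuming the per-request state lives somewhere shared between the two requests. This is only a sketch: `SharedStateMiddleware` and `current_session` are hypothetical names, not the actual gae-sessions code, and the method signatures are simplified.

```python
class SharedStateMiddleware(object):
    """Hypothetical middleware that keeps per-request state on itself."""

    def __init__(self):
        self.current_session = None

    def process_request(self, user):
        # Overwrites whatever the previous request stored here.
        self.current_session = {"owner": user}

    def process_response(self, user):
        # Returns (who the response is for, whose session was found).
        return user, self.current_session["owner"]


mw = SharedStateMiddleware()  # one instance serves every request

# Same order as observed above.
mw.process_request("User1")
mw.process_request("User2")
user2_result = mw.process_response("User2")
user1_result = mw.process_response("User1")

print(user2_result)  # ('User2', 'User2')
print(user1_result)  # ('User1', 'User2')
```

The response for User1 ends up holding the session that belongs to User2, which matches the lost-session symptom.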

So, my questions are the following:

1) Is this interleaving of different requests in the middleware expected behaviour in App-Engine/Django/Python? (Or am I totally confused, and there is something else going on here?)

2) At what level is this interleaving happening (App-Engine/Django/Python)?

I am quite surprised by seeing this behaviour, and so would be interested to understand why/what is happening here.

Alexander Marquardt
  • I think that the following is relevant: http://stackoverflow.com/questions/6214509/is-django-middleware-thread-safe - so, it appears that Django Middleware is not thread-safe, which could explain the above behaviour. – Alexander Marquardt Jan 05 '13 at 01:48
  • And this is relevant as well: http://blog.roseman.org.uk/2010/02/01/middleware-post-processing-django-gotcha/ – Alexander Marquardt Jan 05 '13 at 11:21
  • Is it happening on dev_appserver or in production? If it's in production, there's a possibility that the two requests are running on separate instances. – dragonx Jan 05 '13 at 16:17
  • It is happening in production -- but if they were running on separate instances, I would expect the thread._local object to be at different memory locations for different instances ... – Alexander Marquardt Jan 05 '13 at 22:05
  • If both instances were identical, then it's quite possible that the first thread on each instance would have the thread_local object at the same address, on two different machines. In either case, have you tried the solution to the other question and used the request object instead of tls? – dragonx Jan 05 '13 at 22:14
  • Yes, I have fixed this particular problem by using the request object to pass data between the request and response. However, my website is multi-lingual, and it looks to me like Django sets language on a per-thread basis, so I am worried (if multiple requests are running on a single thread) that the language settings might be corrupted by interfering requests... it would be good to understand what is going on here .. – Alexander Marquardt Jan 05 '13 at 22:57
  • If they were running on different instances then I wouldn't have expected the data from one request to be corrupted by a different request. – Alexander Marquardt Jan 05 '13 at 23:12
  • That's correct, you might get blank data, but not corrupted data. – dragonx Jan 05 '13 at 23:36
  • By "corrupted", I meant data that is written into a different session than the one that it was intended for. – Alexander Marquardt Jan 06 '13 at 02:27
  • The following article gives a good overview of how threading works in the app engine: http://blog.notdot.net/2011/10/Migrating-to-Python-2-7-part-1-Threadsafe – Alexander Marquardt Jan 06 '13 at 21:36

1 Answer


I found the following links to be helpful in understanding what is happening:

Assuming that I am understanding everything correctly, the reason that the above happened is the following:

1) When Django is running, it runs most of the base functionality in a parent (common) thread that includes the Django Middleware.

2) Individual requests are run in child threads which can interact with the parent thread.

The result of the above is that requests (child threads) can indeed be interleaved within the Middleware - and this is by design (only running a single copy of Django and the Middleware would save memory, be more efficient, etc.). [see the first article that I linked to in this answer for a quick description of how threading and child/parent processes interact]

With respect to GAE-Sessions: the thread that we were examining was the same for different requests because it was the parent thread (common to all children/requests), not the child thread that was active each time the middleware was entered.

GAE-Sessions was storing state data in the middleware, which could be overwritten by different requests, given the possible interleaving of the child threads within the parent (Django + Middleware) thread. The fix that I applied to GAE-Sessions was to store all state data on the request object, as opposed to within the middleware.

Fixes: previously, a writable reference to the response handler functions was stored in the DjangoSessionMiddleware object as self.response_handlers. I have moved it to the request object as request.response_handlers. I also removed the _tls variable and moved the data that it contained into the request object.
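
A rough sketch of the shape of that change is shown here. It is not the actual gae-sessions patch: `FakeRequest`, `RequestScopedMiddleware`, and `session_owner` are hypothetical stand-ins so that the example runs without Django, and only `request.response_handlers` corresponds to a name mentioned above.

```python
class FakeRequest(object):
    """Stand-in for Django's request object so the sketch runs on its own."""

    def __init__(self, user):
        self.user = user


class RequestScopedMiddleware(object):
    """Keeps no per-request state on self; everything lives on the request."""

    def process_request(self, request):
        # Fresh state for every request, attached to that request only.
        request.response_handlers = []
        request.session_owner = request.user

    def process_response(self, request, response):
        # Only the handlers registered for this request are applied.
        for handler in getattr(request, "response_handlers", []):
            response = handler(response)
        return response


mw = RequestScopedMiddleware()
req1, req2 = FakeRequest("User1"), FakeRequest("User2")

# Same interleaving as in the question.
mw.process_request(req1)
mw.process_request(req2)
req1.response_handlers.append(lambda resp: resp + " [cookie for User1]")
req2.response_handlers.append(lambda resp: resp + " [cookie for User2]")
out2 = mw.process_response(req2, "page2")
out1 = mw.process_response(req1, "page1")

print(out1)  # page1 [cookie for User1]
print(out2)  # page2 [cookie for User2]
```

Because each request carries its own handlers and session data, the order in which the middleware methods are called no longer matters.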

Alexander Marquardt