1

I'm constructing my app such that each user has their own database (for easy isolation, and to minimize the need for sharding). This means that each web request, and all of the background scripts, need to connect to a different database based on which user is making the request, and use that connection for all function calls.

I figure I can make some sort of middleware that would pass the right connection to my web requests by attaching it to the request variable, but I don't know how I should ensure that all functions and model methods called by the request use this connection.

spiffytech
  • 6,161
  • 7
  • 41
  • 57
  • 2
    The hard way. Which is why no one does this. – Ignacio Vazquez-Abrams Sep 21 '13 at 19:25
  • 1
    I have a hard time seeing a valid usecase for this (except maybe a database hacking competition or something) – Joran Beasley Sep 21 '13 at 19:41
  • The assumption that multi-tenancy is easier to achieve using multiple databases alone is kind of cargo cult security thinking. – Paulo Scardine Sep 21 '13 at 20:04
  • Here's a related question: http://stackoverflow.com/questions/13372001/multi-tenancy-with-sqlalchemy - the approach I ended up using was to use a separate Postgres schema (a.k.a. namespace) per tenant. I'm pretty sure you can configure SQLALchemy to maintain a pool of connections to different databases and to check out a correct connection on each new request but you would really need to test how the app works with hundreds or thousands of tenants – Sergey Sep 21 '13 at 22:38

1 Answers1

1

Well how to "ensure that all functions and model methods called by the request use this connection" is easy. You pass the connection into your api as with any well-designed code that isn't relying on global variables for such things. So you have a database session object loaded per-request, and you pass it down. It's very easy for model objects to turtle that session object further without explicitly passing it because each managed object knows what session owns it, and you can query it from there.

db = request.db
user = db.query(User).get(1)
user.add_group('foo')

class User(Base):

    def add_group(self, name):
        db = sqlalchemy.orm.object_session(self)
        group = Group(name=name)
        db.add(group)

I'm not recommending you use that exact pattern but it serves as an example of how to grab the session from a managed object, avoiding having to pass the session everywhere explicitly.

On to your original question, how to handle multi-tenancy... In your data model! Designing a system where you are splitting things up at that low of a level is a big maintenance burden and it does not scale well. For example it becomes very difficult to use any type of connection pooling when you have an arbitrary number of independent connections. To get around that people commonly use the SQL SCHEMA feature supported by some databases. That allows you to use the same connection but have access to a different table structure per session. That's better, but again managing all of those schemas independently should raise some red flags, violating DRY with all of that duplication in your data model. Any duplication at that level quickly becomes a burden that you need to be ready for.

Michael Merickel
  • 23,153
  • 3
  • 54
  • 70