2

I fully understand the pros and cons of using shared vs separate schemas (or databases in mysql). Looking at those we have chosen to use a shared schema.

Looking for ideas on how we can more easily accomplish multi-tenancy. I am happy to add a key to each table, but that means that we will need to add "where tenant_id = X" for every single table that uses multi-tenancy in every single query. Sounds painful.

A much better approach would be to set some parameter that affects all tables in a query or all queries in a connection. This would avoid having to update all existing queries and having to include checks for tenant ids in future queries.

I had some initial ideas (below), but they all seem pretty painful as well.

  1. Create temporary views for each table that automatically filter by tenant id (possibly using temporary tables?)

  2. Create views that filter by tenant id and dynamically set the table name in the query

  3. Use partitions by tenant id and query individual partitions.

Anyone have any better ideas?

Regular User
  • 682
  • 7
  • 16
  • Have you considered a table name prefix? – Marcus Adams Mar 28 '19 at 19:59
  • So separating each table into multiple tables with a tenant prefix? I don't think it's practical to create a new set of tables for each tenant because that would complicate adding tenants and would make table changes difficult, as we would need to update X tables instead of one. – Regular User Mar 28 '19 at 20:07
  • 1
    You didn't indicate that each tenant has an identical schema. I don't see the issue anyway. With a column in each table, of course, you get to leverage table partitions. And you're probably going to put some kind of wrapper around everything so you don't have to add a WHERE clause to every query anyway. – Marcus Adams Mar 28 '19 at 20:14
  • When using a shared schema, I figured that meant all tenants would be part of the same schema. I'm curious about what you meant by wrapper, though, what kind of wrapper do you mean? Thanks! – Regular User Mar 29 '19 at 12:53

1 Answers1

1

One problem with a shared schema model that you are proposing is your clients. I have worked with a number of companies in the past where their client agreements stipulated that their data be self-contained and not mixed with other client data, which meant that we had to go down the separate schema per client route.

You might find this is a more flexible solution too, as if any of your clients get really large, you can move them off to their own server/instance, and use a utility such as ProxySQL to direct the traffic to the relevant server/instance based on the schema name.

If you have data that is shared between all the clients, put that on its own 'primary' server, which replicates down to the client server(s). You can then replicate further by chaining replicas off the client server(s) if you need to scale out your reads.

However you decide to do it, I'd just be careful of the legal implications of combining client data in a single schema - check that you have your client permission to combine the data. Clearly if you are building something like a social network, then you will need to combine the data, but that is to be expected.

Dave Rix
  • 1,621
  • 1
  • 11
  • 17