23

I'm working on a single database with multiple database schemas,

e.g
[Baz].[Table3],
[Foo].[Table1],
[Foo].[Table2]

I'm wondering why the tables are separated this way besides organisation and permissions.

How common is this, and are there any other benefits?

Tom H
  • 46,766
  • 14
  • 87
  • 128
Kye
  • 5,919
  • 10
  • 49
  • 84

4 Answers4

26

You have the main benefit in terms of logically groupings objects together and allowing permissions to be set at a schema level.

It does provide more complexity in programming, in that you must always know which schema you intend to get something from - or rely on the default schema of the user to be correct. Equally, you can then use this to allow the same object name in different schemas, so that the code only writes against one object, whilst the schema the user is defaulted to decides which one that is.

I wouldn't say it was that common, anecdotally most people still drop everything in the dbo schema.

Andrew
  • 26,629
  • 5
  • 63
  • 86
13

I'm not aware of any other possible reasons besides organization and permissions. Are these not good enough? :)

For the record - I always use a single schema - but then I'm creating web applications and there is also just a single user.

Update, 10 years later!

There's one more reason, actually. You can have "copies" of your schema for different purposes. For example, imagine you are creating a blog platform. People can sign up and create their own blogs. Each blog needs a table for posts, tags, images, settings etc. One way to do this is to add a column blog_id to each table and use that to differentiate between blogs. Or... you could create a new schema for each blog and fresh new tables for each of them. This has several benefits:

  • Programming is easier. You just select the approppriate schema at the beginning and then write all your queries without worrying about forgetting to add where blog_id=@currentBlog somewhere.
  • You avoid a whole class of potential bugs where a foreign key in one blog points to an object in another blog (accidental data disclosure!)
  • If you want to wipe a blog, you just drop the schema with all the tables in it. Much faster than seeking and deleting records from dozens of different tables (in the right order, none the less!)
  • Each blog's performance depends only (well, mostly anyway) on how much data there is in that blog.
  • Exporting data is easier - just dump all the objects in the schema.

There are also drawbacks, of course.

  • When you update your platform and need to perform schema changes, you need to update each blog separately. (Added yet later: This could actually be a feature! You can do "rolling udpates" where instead of updating ALL the blogs at the same time, you update them in batches, seeing if there are any bugs or complaints before updating the next batch)
  • Same about fixing corrupted data if that happens for whatever reason.
  • Statistics for all the platform together are harder to calculate

All in all, this is a pretty niche use case, but it can be handy!

Vilx-
  • 104,512
  • 87
  • 279
  • 422
  • So you create schemas dynamically in this case? – julealgon Mar 24 '21 at 17:44
  • @julealgon - Yes. Although that should obviously be a relatively rare thing to do. I'm not sure if a blog platform was the best example, although it did illustrate the point nicely. In real life I've seen this being used in accounting software where a single database could contain data for multiple companies. This is useful in the case where one actual company is split into several sub-companies "on paper", for various legal purposes. A common practice or so I understand. Since adding a new sub-company isn't a common operation at all, this approach turned out to be quite practical. – Vilx- Mar 24 '21 at 19:03
  • @julealgon - Also, it was probably the easiest way to do it when adding the "multi-company feature" to the software. :D – Vilx- Mar 24 '21 at 19:04
  • Interesting to hear that since our company currently uses a multi-tenant approach using discriminator fields on all tables. I wonder how migrations would work however, for example when you have to introduce a column using, say EntityFramework... – julealgon Mar 24 '21 at 19:22
  • 1
    @julealgon - I don't know. The accounting software I was referring to was an old one, written in Delphi, lots of legacy crap... certainly not something to draw inspiration from. But there were a _few_ successful or at least interesting ideas in there; I consider this one of them. EntityFramework and migrations though... you'll have to think outside the box here. As stated above, deployments become a lot more complicated in this kind of setup. But there are also novel opportunities, like the rolling deployments. – Vilx- Mar 24 '21 at 20:55
  • 1
    @julealgon - Another thing to consider is - well, how many tenants do you expect there to be? If you're making wordpress.com which hosts literally _thousands_ of blogs, then a discriminator field might actually be better. I have no idea how MySQL/MSSQL/Oracle/whatever would behave if it had thousands of schemas. Maybe it wouldn't care, maybe it would suffer badly. – Vilx- Mar 24 '21 at 21:01
6

To me, they can cause more problems because they break ownership chaining.

Example:

Stored procedure tom.uspFoo uses table tom.bar easily but extra rights would be needed on dick.AnotherTable. This means I have to grant select rights on dick.AnotherTable to the callers of tom.uspFoo... which exposes direct table access.

Unless I'm completely missing something...

Edit, Feb 2012

I asked a question about this: SQL Server: How to permission schemas?

The key is "same owner": so if dbo owns both dick and tom schema, then ownership chaining does apply. My previous answer was wrong.

Community
  • 1
  • 1
gbn
  • 422,506
  • 82
  • 585
  • 676
3

There can be several reasons why this is beneficial:

  • share data between several (instances of) an application. This could be the case if you have group of reference data that is shared between applications, and a group of data that is specific for the instance. Be careful not to have circular references between entities in in different schema's. Meaning don't have a foreign key from an entity in schema 1 to another entity in schema 2 AND have another foreign key from schema 2 to schema 1 in other entities.

  • data partitioning: allows for data to be stored on different servers more easily.

  • as you mentioned, access control on DB level
EJB
  • 2,383
  • 14
  • 15