1

We currently have an SQL Server 2000 database with one table containing data for multiple users. The data is keyed by memberid, which is an integer field. The table has a clustered index on memberid.

The table is now about 200 million rows. Indexing and maintenance are becoming issues. We are debating splitting the table into a one-table-per-user model.

This would imply that we could end up with a very large number of tables, potentially up to 2,147,483,647 (the maximum positive int value).

My questions:

  1. Does anyone have any experience with a SQL Server (2000/2005) installation with millions of tables?

  2. What are the implications of this architecture with regard to maintenance and access using Query Analyzer, Enterprise Manager, etc.?

  3. What are the implications of having such a large number of indexes in a database instance?

All comments are appreciated.

Thanks

Edit: I do not agree to this question being migrated to Serverfault. This is a programming-related question.

user40766
  • 161
  • 3

6 Answers

5

A few thoughts here:

1) Don't do that. Seriously. Millions of tables would be a nightmare, and would likely cause far more problems than it solves.

2) If you really want to break the table up into multiple tables, you don't need to use that many. Depending on your hardware, I'd expect 50 million rows to be no problem, so you could divide up your data into 4 tables.

3) What I would do if at all possible, would be to upgrade to SQL Server 2005 or 2008 and use table partitioning. That would allow you to subdivide your data within a table. Not a perfect solution, but far better than millions of tables.
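As a sketch of option 3, here is roughly what 2005-style range partitioning on memberid might look like; the function and scheme names, boundary values, and filegroup mapping are all hypothetical:

```sql
-- Hypothetical names and boundaries; pick them from the real memberid distribution.
CREATE PARTITION FUNCTION pfMemberRange (int)
AS RANGE RIGHT FOR VALUES (50000000, 100000000, 150000000);  -- yields 4 partitions

CREATE PARTITION SCHEME psMemberRange
AS PARTITION pfMemberRange ALL TO ([PRIMARY]);  -- or spread across filegroups

CREATE TABLE dbo.MemberData
(
    memberid int          NOT NULL,
    rowdata  varchar(100) NOT NULL
) ON psMemberRange (memberid);  -- each row is routed to a partition by memberid
```

Index rebuilds can then target a single partition instead of the whole 200-million-row table.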

To answer your specific questions: it is unlikely that SQL Server could handle that many tables in one instance, and even if it could, having one table per user would make Query Analyzer, Enterprise Manager, etc. pretty well useless.

Quick addition: from Microsoft site:

Database objects include all tables, views, stored procedures, extended stored procedures, triggers, rules, defaults, and constraints. The sum of the number of all these objects in a database cannot exceed 2,147,483,647.

http://msdn.microsoft.com/en-us/library/aa933149(SQL.80).aspx

Pretty amazing that the number is EXACTLY the one you specified... Hmmm...

jeffa00
  • 176
  • 2
  • Thanks for pointing out that 2,147,483,647 is the total number of objects in a database. I was under the impression that it was the maximum number of tables. – user40766 Apr 28 '10 at 18:15
4

Index maintenance should be done based on existing fragmentation, not blindly. With a clustered IDENTITY column, you should not have much to worry about. SQL Fool's defrag script will help.
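As a sketch of what fragmentation-based maintenance looks like (this uses the SQL Server 2005+ DMV; on SQL Server 2000 the equivalents are DBCC SHOWCONTIG and DBCC INDEXDEFRAG):

```sql
-- Find indexes in the current database that are worth touching at all.
SELECT  OBJECT_NAME(ips.object_id) AS table_name,
        i.name                     AS index_name,
        ips.avg_fragmentation_in_percent
FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN    sys.indexes AS i
        ON  i.object_id = ips.object_id
        AND i.index_id  = ips.index_id
WHERE   ips.avg_fragmentation_in_percent > 10
ORDER BY ips.avg_fragmentation_in_percent DESC;

-- Then, per index: REORGANIZE when fragmentation is moderate, REBUILD when heavy.
-- ALTER INDEX IX_SomeIndex ON dbo.SomeTable REORGANIZE;  -- roughly 10-30%
-- ALTER INDEX IX_SomeIndex ON dbo.SomeTable REBUILD;     -- above roughly 30%
```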

200 million rows is not that much, and IMHO not yet worth partitioning because of the query overhead, the many table names requiring dynamic SQL, and so on. Unless, perhaps, you have a tiny maintenance window.

FWIW, we have around 6 million rows per day INSERTed into one table.

The pain is worse than the gain based on the info you've given.

gbn
  • 6,079
  • 1
  • 18
  • 21
2

Splitting into this many tables is a nightmare, and not at all recommended. Among other complications, think of the complexity required to add a new user--do you have to dynamically create a new table?

The answer is simply better indexing, specifically designed around the queries you are using. Since you haven't detailed those queries, I can't give you specific recommendations.

In general, though, we support many databases with tables as large as this, and yes, it can be a pain, but it's definitely possible.

If you do decide to implement partitions, use a different way of dividing the data (maybe current data vs. old data), and a reasonably small number of partitions. Keep in mind that if you do this "by hand" (instead of using the SQL 2005+ partitioning feature), then all queries against these partitioned tables will likely have to be re-designed; a partitioned view, sketched below, can soften that.
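For illustration, a minimal local partitioned view, which works on SQL Server 2000; the table names, the column being split on, and the cutoff date are all hypothetical:

```sql
-- The CHECK constraints are what let the optimizer skip tables.
CREATE TABLE dbo.MemberData_Old
(
    memberid int          NOT NULL,
    created  datetime     NOT NULL CHECK (created <  '20100101'),
    rowdata  varchar(100) NOT NULL
);

CREATE TABLE dbo.MemberData_Current
(
    memberid int          NOT NULL,
    created  datetime     NOT NULL CHECK (created >= '20100101'),
    rowdata  varchar(100) NOT NULL
);
GO

-- Queries go through the view; any member table whose CHECK constraint
-- rules out the requested rows is never touched.
CREATE VIEW dbo.MemberDataAll
AS
SELECT memberid, created, rowdata FROM dbo.MemberData_Old
UNION ALL
SELECT memberid, created, rowdata FROM dbo.MemberData_Current;
```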

Edit: In a specific answer to one part of your question, yes Enterprise Manager/Query Analyzer can start to do very bad things when you have massive numbers of tables. We have had poorly designed dbs with thousands of tables, and you can't even expand the "Tables" folder in the tree view without waiting a LONG TIME for it to itemize them all.

BradC
  • 2,220
  • 4
  • 27
  • 36
0

One per user seems a bit like overkill and rough on your codebase. You'd more or less have to use dynamic SQL within any stored proc that uses those tables, which definitely complicates your life and future development and testing. (I speak from experience -- we used to have some very intricate tables that we generated daily; all interactions with those tables were dynamic SQL.)
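To make that concrete, here is roughly the shape every data-access proc takes under a one-table-per-user scheme; the proc name, table naming convention, and query are hypothetical:

```sql
-- Every table reference has to be assembled at runtime.
CREATE PROCEDURE dbo.GetMemberRows
    @memberid int
AS
BEGIN
    DECLARE @sql nvarchar(4000);

    -- QUOTENAME guards the generated identifier against injection.
    SET @sql = N'SELECT * FROM dbo.'
             + QUOTENAME(N'MemberData_' + CONVERT(nvarchar(10), @memberid));

    -- Each distinct table name compiles its own plan, so plan reuse suffers too.
    EXEC sp_executesql @sql;
END
```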

Not knowing the requirements of the apps using this data, could you age out old data into an archive or history table/tables?
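If aging out is an option, a sketch of the usual SQL 2000-era pattern: copy the old rows once, then delete them in small batches so locks and log growth stay manageable (the archive table and cutoff date are hypothetical):

```sql
-- Copy everything older than the cutoff in one pass.
INSERT INTO dbo.MemberData_Archive (memberid, created, rowdata)
SELECT memberid, created, rowdata
FROM   dbo.MemberData
WHERE  created < '20090101';

-- Delete in batches; SET ROWCOUNT caps each DELETE on SQL Server 2000.
SET ROWCOUNT 10000;
DELETE FROM dbo.MemberData WHERE created < '20090101';
WHILE @@ROWCOUNT > 0
    DELETE FROM dbo.MemberData WHERE created < '20090101';
SET ROWCOUNT 0;  -- always restore the default
```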

For SQL 2k5/2k8 you could use partitioned tables which might help as well and abstract the multiple tables thing from your queries and apps. There are some gotchas with partitioned tables but they might work for you here.

With this kind of volume, you're going to have to do some specific prototyping and benchmarking since there's no one-size-fits-all answer.

Joe
  • 214
  • 1
  • 4
  • 10
  • Many thanks for your comments. I am aware that all queries will have to be dynamic SQL, since table names would need to be referenced dynamically. Since the procs are relatively simple, this does not appear to be a major concern. Also, the 2 billion is not really expected to be hit, since inactive users get purged from the system periodically. What I am concerned with is getting to the point where I am unable to even open the database in Query Analyzer due to the sheer number of tables. – user40766 Apr 28 '10 at 17:50
  • See the edit on my answer. Yes, Enterprise Manager and Query Analyzer really can start to have problems when you ask them to itemize even thousands of tables. – BradC Apr 28 '10 at 18:20
0

Sounds like Table Partitioning is the way to go. You will need at least SQL Server 2005, however.

Here is a good article on the subject to get you started: Kimberly Tripp MSDN Article

etoisarobot
  • 249
  • 2
  • 4
  • 11
0

I would revisit the design.

You say it's clustered on memberid, but this could cause page splits (and fragmentation) when data is added. Better to cluster on an increasing surrogate identity (and have a unique index, perhaps even the primary key, on the business key, which would include memberid).

Alternatively, even if not clustering on it, you should have a unique index on memberid plus the unique part of the remaining columns, since it sounds like you have multiple rows per member. An index on memberid alone is certainly unlikely to be covering.
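A minimal sketch of that layout; itemid is a hypothetical stand-in for whatever else makes a row unique within a member:

```sql
CREATE TABLE dbo.MemberData
(
    id       int IDENTITY(1, 1) NOT NULL,
    memberid int                NOT NULL,
    itemid   int                NOT NULL,
    rowdata  varchar(100)       NOT NULL,
    -- Ever-increasing clustering key: new rows land at the end, no page splits.
    CONSTRAINT PK_MemberData PRIMARY KEY CLUSTERED (id),
    -- Enforces the business key; also gives seeks on memberid an index to use.
    CONSTRAINT UQ_MemberData_Member UNIQUE (memberid, itemid)
);
```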

Cade Roux
  • 375
  • 2
  • 5
  • 18