2

Years back, I had created a small system against a requirement where a snapped image from Android was uploaded onto a server along with its respective custom data and then stored on the disk and the custom data describing the image was further broken up and stored in the database. Each of the snapped images was actually a part of a campaign. Over the period, the system went on growing enough and now there are now over 10,000 campaigns already and over 500-1000 images per campaign. Though, the performance is not all that bad however I believe its just a matter of time. We now are thinking of archiving the past campaigns in another database called as Archive. Now here is what I am planning to do.

  1. The Archive Database will have the exact same structure and the Archive functionality may have a search mechanism however, retrieval speed is not much of a concern here as this will happen very rarely.
  2. I was thinking of removing records from one database and cloning it in the other, however the identity column probably will not let me do that very seamlessly. (and I may be wrong too.)
  3. There needs to be a restore option too. (This is probably the most challenging part)
  4. If I just make the records blank(except for the identity) from the original database and copy it to the other with no identity constraint, probably it is not going to help and I think it will loose the purpose of the exercise.

Any advise over this? Is there any known strategy or pattern or literature or even a link that may guide me on this?

Thank you in advance for your help.

Maggi
  • 109
  • 2
  • 8

2 Answers2

1

I think good solution for you case is SSIS. This technology can provide fast loading of big volume of data to you Archive system. In addition you can use table partitioning to increase performance of manipulation of big data in Archive system. Also check such thing like comumnstore indexes (but it depends on version of SQL server). I created such solution with following steps: 1) switch partition from main table t to another table t_1(the oldest rows in a table) in production system 2) load data to Archive system from table t_1 3) drop or truncate table t_1

Vladimir Semashkin
  • 1,270
  • 1
  • 10
  • 21
1

I say: as long as you don't run out of space on your server, leave it as it is.

Over the period, the system went on growing enough and now there are now over 10,000 campaigns already and over 500-1000 images per campaign.

→ That's 5-10 millions of rows (created over several years).
For SQL Server, that's not that much.
Yes, I know...we're talking about image files stored in the database, not "regular" rows. Still, if your server has reasonably sized hardware, it shouldn't really matter.

I'm talking from experience here - at work, we have a SQL Server database which we use to store PDF files and images.
In our case, we're using a "regular" image column - since you're using SQL Server 2008, you could even use FILESTREAM (maybe you already do, but I don't know - you didn't say anything how exactly you're storing the image in the database).
We started the project on SQL Server 2005, where FILESTREAM wasn't available yet. In the meantime, we upgraded to SQL Server 2012, but never changed the data type in the table where we're storing the files.


If you still prefer creating a separate archive database and moving old data there, one piece of advice concerning this:

2) I was thinking of removing records from one database and cloning it in the other, however the identity column probably will not let me do that very seamlessly. (and I may be wrong too.)

[...]

4) If I just make the records blank(except for the identity) from the original database and copy it to the other with no identity constraint, probably it is not going to help and I think it will loose the purpose of the exercise.

You don't need to set the column to identity in the archive database as well.
Just leave everything as it is in the main database, but remove the identity setting from the primary key in the archive database.
The archive database doesn't ever need to generate new keys (hence no need for identity), you're just copying rows with already existing keys from the main database.

Community
  • 1
  • 1
Christian Specht
  • 35,843
  • 15
  • 128
  • 182