2

There is a project for lab. The requirement to save the lab data (raw data and related infomation ) for research for years. So the data must be saved for years, but the data is so big, every raw data is more than 10 GB. If we store the raw data in SQL Server's filetable and store related information in a normal SQL Server table, after months or years, the database's size is so big than we must move some history data out of database.

Maybe we could use adding files (in different hard disk) to the filestream filegroup for filetable (store raw data ), but I feel that's not suitable for maintaining SQL Server. We must keep the raw data and relation information's consistency.

In fact we have considered to use tape to save the history raw data. We want to use hard disk to save latest raw data and use tape to save the history raw data. When we move the history data to tape we record the move log in a table, so that we can know where the history data is moved and pick up it as soon as possible.

Is there a good advice for me:

  • How to store the raw data and related information ?
  • Is filetable suitable for the scenario?
  • Is there another good solution for this scenario?
marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
huoxudong125
  • 1,966
  • 2
  • 26
  • 42
  • I'd say, store the meta-data in the database and the file itself "somewhere else". Where that is depends on how, when and how often you want to use it. I shouldn't save the files in the database, but you came by that answer yourself already. – Alfons Oct 28 '13 at 12:17
  • One requirement detail:we have a machine(Xradia MicroXCT) to scan the sample(such as small rocks),in the processing we will input some parameters for the MicroXCT's control system.Because of the parameters are important for MicroXCT's result,so we must save the parameters and the MicroXCT's result(raw data) which size is bigger than 10 GB._The raw data will be processed in next days or months or years,so we must keep the raw data and the parameters for long time._ – huoxudong125 Oct 29 '13 at 04:25
  • Is the raw data a single file or a series of files that create the entire dataset? – Dan Snell Nov 03 '13 at 01:40

1 Answers1

1

There is not a simple answer. Even Microsoft will not reply you with a simple answer, see here: http://msdn.microsoft.com/en-us/library/hh403405.aspx

I cannot reply on filetables. Just remember you need sql 2012.

Based on my experience (as DBA):
- store on filesystem is better for performances and cost: you can use different storage type easly, different machines
- store on filesystem is better for backup: you can deduplicate, manage better compression and so on.

The real drawback is, if you store in filesystem, you lose transactionality.

I had a similar scenario, with files up to some GB:
I used a correlation table to specify some metadata and where the file is.

user_0
  • 3,173
  • 20
  • 33