2

I'm designing a datawarehouse solution and I'm a newbie in disk configuration issues, let me explain you.

Our storage is spread over 6 storage enlosures having each of them 5 raid-1 disk arrays, and having 2 LUNS defined per each disk array, which makes a total 48 LUNS (this is following Microsoft fast track recommendations for datawarehouse architectures).

I would like to partition my data, on other projects I have worked before, we always followed a 1 partition - 1 filegroup rule. On the microsoft fast track recomendations it is advised to create a filegroup and then for that filegroup a data file per each lun... but I pretend to have a week level partitioning... if I apply that rule I think that I'll get too many files and a complex layout.

I'm thinking of just creating just one filegroup (with the 48 lun data files), but still create the partitions since I want to keep soem of the benefits of partitions like partition switching... Is this scenario not recommended? What would you suggest?

river0
  • 123
  • 3
  • I'm not following. Could you explain the reasoning for so many LUNs? What are some of the characteristics of your DB; read/write frequency, size, growth pattern, etc? – Chris S Feb 11 '11 at 20:12
  • the reason for so many luns... is basically because that's the recomemnded by microsoft for fast track datawarehouse architectures (to increase read performance), for size we estimate an initial 1 TB and will grow around 500 GB per year, the usage is the typical datawarehouse one, one big data load per day and a lot of querying (30 potential concurrent users) – river0 Feb 11 '11 at 20:51

1 Answers1

3

Answering this requires diving into Storage Geek. I apologize in advance.

The reason Microsoft seems to suggest 48 separate partitions is for one reason: to maximize in-OS parallelization for I/O's. By having 48 LUNs, the OS has to keep 48 separate I/O queues, and those queues can in theory be served in parallel. If one LUN is particularly slow (it's doing heavy random writes) it won't hold up access to other LUNs.

On modern hardware this is a fractional-percentage gain for a LOT of storage headache. Unless you know you will be pressing your data warehouse to the absolute upper limit, it isn't worth it. Modern RAID cards are fast enough that they can handle this for you. Having 4 LUNs could yield gains. 48 may actually hurt.

Storage these days is generally characterized by the performance metric of I/O Operations per second (I/O Ops). Each drive has its own upper limit for random I/O (ranges between 90-180 per drive, depending on RPMs and a few other things). When you gang drives together, such as in a RAID10 set, this I/O Ops count is additive. A 12-disk RAID10 set will have the same I/O Ops capacity as 6 Raid1 pairs, and doesn't force you into creating six separate DB files. By creating a single large RAID10 set you can create a single large DB file that can handle huge amounts of load.

Going back to what I said in the second paragraph about a slow LUN not holding up access to other LUNs, this is why maximizing I/O Ops for a LUN makes sense. It is far less likely to block at all if it has enough I/O Op overhead. By creating a large RAID10 array, the parallelization is pushed onto the RAID card, not the operating system, which leaves the OS free to do other things. You'll still get the parallelization advantage, and you leverage dedicated hardware for it.

For database servers it is wise to keep data-file and log-file I/O on different spindles. The exact percentage of which I'll leave to the SQL Server experts (I'm not one), and is likely based on your exact configuration and use-patterns. As it is a data-warehouse you'll need lots of log-space to handle the bulk loads. Log I/O is significantly sequential, where data I/O is significantly random, so maximal logging performance is best found by putting the logs on different spindles than the data-files are.

In your case, you may be able to get away with 2 LUNs. A big RAID10 set for your data-files, and smaller RAID10 set for your log-files.

sysadmin1138
  • 133,124
  • 18
  • 176
  • 300
  • thnaks sysadmin, really thorough and complete answer, now i see it more clear and I think we will go to a layout close to what you pointed out – river0 Feb 12 '11 at 19:31