We run a popular web app whose backend file storage lives on a file server, replicated to a second file server via DFSR for failover. We are reaching the theoretical limits of DFSR, so we need to start looking at sharding our storage.
What is the best way to go about sharding? I know we could abstract out our file storage at the application level, but, among other things, my mind boggles at how third-party controls that interact with the filesystem would hook into that abstraction. What are the best techniques you've seen, or can think of? Assume for now that we have a directory structure like /Customers/bikesystems, /Customers/10degrees, etc.: one big /Customers directory, with each customer having its own folder inside it.
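For context, an application-level abstraction along the lines I'm describing might look something like this minimal Python sketch (the class and method names here are my own invention, not anything we actually have); the third-party-controls caveat still applies, since anything that expects a real filesystem path can't go through such an interface:

```python
from abc import ABC, abstractmethod

# Hypothetical application-level storage abstraction: callers address
# files by customer ID plus a path relative to that customer's folder,
# so the backend is free to place the data wherever it likes.
class StorageBackend(ABC):
    @abstractmethod
    def read(self, customer_id: str, relative_path: str) -> bytes: ...

    @abstractmethod
    def write(self, customer_id: str, relative_path: str, data: bytes) -> None: ...

# A toy in-memory implementation, just to show the shape of the interface.
class InMemoryBackend(StorageBackend):
    def __init__(self) -> None:
        self._files: dict[tuple[str, str], bytes] = {}

    def read(self, customer_id: str, relative_path: str) -> bytes:
        return self._files[(customer_id, relative_path)]

    def write(self, customer_id: str, relative_path: str, data: bytes) -> None:
        self._files[(customer_id, relative_path)] = data
```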
My initial thinking has me breaking up that larger customer directory into a more hierarchical structure, as in /Customers/b/bikesystems, /Customers/1/10degrees (taking the first letter or number of each customer ID), which gives me the ability to create DFS namespaces for each first character that makes up a customer ID (which is [a-z0-9] for you regex folks). So, a potential 36 DFS namespaces. Then from there I can shuffle those namespaces around to various servers as capacity increases for any one of them. And this would give me a lot more breathing room before I reach those theoretical DFSR limits.
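To make that mapping concrete, here's a minimal Python sketch of the first-character bucketing described above (the `shard_path` function name and the `/Customers` root are illustrative; this only models the path scheme, not the DFS namespace plumbing):

```python
import string

# Customer IDs start with [a-z0-9], giving up to 36 top-level buckets,
# each of which can become its own DFS namespace.
VALID_FIRST_CHARS = set(string.ascii_lowercase + string.digits)

def shard_path(customer_id: str, root: str = "/Customers") -> str:
    """Map a customer ID to its sharded folder path.

    e.g. 'bikesystems' -> '/Customers/b/bikesystems'
    """
    first = customer_id[0].lower()
    if first not in VALID_FIRST_CHARS:
        raise ValueError(f"customer ID must start with [a-z0-9]: {customer_id!r}")
    return f"{root}/{first}/{customer_id}"

# The two examples from the question:
print(shard_path("bikesystems"))  # /Customers/b/bikesystems
print(shard_path("10degrees"))   # /Customers/1/10degrees
```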
Is that the best approach?
I know we could be looking at Linux or other enterprise-level storage systems (Isilon, etc.). For this discussion, however, I'd like to keep things limited to Windows for now. Unless, of course, you have a burning desire to extol the benefits of a different solution altogether and would like to help me see the light!