
I've got a server with 40TB of data and 35 million files. While the server itself has RAID and all that jazz, I'm concerned about what would happen should something physically destroy the server (fire, lightning, etc).

The system is designed to run everything off a single drive (reams of legacy code) so segmenting into little drives isn't an option (the resources required to rewrite the code are prohibitive).

I was wondering what cost-effective offsite backup options there are. Moving the data to a hosted solution is on the table, but since a lot of this is media data that needs to be frequently processed by a server farm, that may introduce latency and bandwidth issues.

edit: by "single drive" I mean from the user's perspective. The data itself may be distributed in one drive so long as the software that accesses this data can treat everything as a single drive.

Jeff
  • Our FAQ, which you skipped, states right at the top that we don't do "Product, service, or learning material recommendations". – Chopper3 Sep 27 '12 at 15:48
  • @Chopper3: darn – Jeff Sep 27 '12 at 15:52
  • @Chopper3 This is a rather specific question. Do you have any idea where I can find the proper experts to ask? I'm looking for someone who will help with a technical answer as much as a cost factor one. – Jeff Sep 27 '12 at 16:01
  • Getting a hosted/managed service provider to back up 40TB of data for you is probably not going to be cheap :-) If you can give us some of the requirements info we do have some storage guys around here who might weigh in with implementation ideas... – voretaq7 Sep 27 '12 at 16:22

1 Answer


Like Chopper said, we don't do product and service recommendations - so everything I'm going to tell you is pretty generic. You're going to have to go to your thinking closet (possibly your crying corner after seeing some of the price tags) and figure out how to make this happen in your environment.


If you're asking "How do I back up this server?", which is what it sounds like you're trying to figure out, start by defining your requirements:

  • How much data?
  • How often do you need to back it up?
  • How frequently does it change? (How much will be in each backup cycle?) etc.

Knowing that will help you figure out which solution might be most effective. You can probably do the backup itself with many commercial (or open-source) backup tools, but picking the media and backup schedule will vary.
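If you don't know your change rate yet, you can get a rough number by walking the filesystem and adding up everything modified since your last backup. This is only a sketch: `change_set` is a made-up helper name, it trusts modification times, and it won't notice deleted files. With 35 million files the walk itself will take a while.

```python
import os
import stat


def change_set(root: str, since_epoch: float) -> tuple[int, int]:
    """Return (file_count, total_bytes) for regular files under root
    whose modification time is later than since_epoch (Unix seconds)."""
    count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mtime > since_epoch:
                count += 1
                total_bytes += st.st_size
    return count, total_bytes
```

Calling something like `change_set("/data", time.time() - 86400)` (the path is hypothetical) gives you an approximate one-day change volume to plug into the decisions below.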

  • Lots of high-volume changes
    You're probably a candidate for a SAN and SAN replication (or equivalent with a NAS). Basically clone the whole filesystem to another site and stream the changes over a dedicated network link.
    Note this isn't a real "backup" - if someone deletes an important file it disappears from both places - it just gives you Disaster Recovery.
  • A big initial set, but low change volume
    A single base backup and incremental backups thereafter (possibly with "consolidation" or "synthetic full" backups to keep restore time under control) could work well here.
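
To show what a "synthetic full" buys you, here is a toy model, not how any particular backup product does it. Each backup is represented as a plain dict mapping a path to a version label, and a value of `None` marks a deleted file; both conventions are assumptions made up for this illustration.

```python
from typing import Optional

Manifest = dict[str, Optional[str]]


def synthetic_full(base: Manifest, incrementals: list[Manifest]) -> Manifest:
    """Fold a chain of incrementals (oldest first) into the base backup,
    producing a new full manifest without re-reading the live server."""
    merged = dict(base)  # copy so the original base is left untouched
    for inc in incrementals:
        for path, version in inc.items():
            if version is None:
                merged.pop(path, None)  # file was deleted in this cycle
            else:
                merged[path] = version  # file was added or changed
    return merged
```

A restore without consolidation has to replay the base plus every incremental since; after consolidation it only needs the merged result, which is why it keeps restore time under control.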

Picking the media type is also important - tapes are the traditional choice, and they hold a lot.
If the cost of tapes is prohibitive, terabyte-class disk drives are relatively cheap and reliable too. Optical media is probably not an option with your 40TB data set size.

SAN replication over a dedicated link is a great solution aside from cost. Another option is backing up to a SAN (or just a host with a ton of disk) at your remote site, but this would also likely require a dedicated (FAST) network connection. This is especially true if your change set is large (if you have a small change set you can always do your initial backup locally and then ship the storage device off-site and make do with a slower link).
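
To judge whether a "slower link" is enough, work out the sustained speed needed to push one cycle's changes through in your backup window. A rough sketch, assuming decimal units (1 GB = 10^9 bytes) and a guessed 80% usable link efficiency; `required_mbps` is a hypothetical helper, not part of any tool.

```python
def required_mbps(change_gb: float, window_hours: float,
                  efficiency: float = 0.8) -> float:
    """Link speed in Mbit/s needed to ship change_gb gigabytes
    within window_hours, given the usable fraction of the link."""
    bits = change_gb * 1e9 * 8          # decimal gigabytes to bits
    seconds = window_hours * 3600
    return bits / seconds / 1e6 / efficiency
```

For example, 500 GB of changes in an 8-hour window works out to about 174 Mbit/s under these assumptions, while 50 GB needs roughly a tenth of that.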


In addition to backing up make sure you take into account the restore process -- If it takes you 3 months to get your data back over a slow network uplink the backup may not be very useful...
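
A quick back-of-the-envelope check for the restore side, again assuming decimal units (1 TB = 10^12 bytes) and a guessed 80% link efficiency. The function name `transfer_days` is made up for this sketch.

```python
def transfer_days(data_tb: float, link_mbps: float,
                  efficiency: float = 0.8) -> float:
    """Days needed to move data_tb terabytes over a link_mbps link,
    given the usable fraction of the link."""
    bits = data_tb * 1e12 * 8                       # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86400                          # seconds per day
```

Under these assumptions a full 40TB restore takes roughly 46 days over 100 Mbit/s and about 4.6 days over 1 Gbit/s, which is why shipping physical media back is often the faster restore path.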

voretaq7