2

My disks are 10x 1TB 7200 RPM SAS drives in a RAID 10 behind a MegaRAID 9260 hardware controller with cache/BBU, giving a 4.6TB RAID 10 volume. hdparm -t (run while the device was empty) reports about 500MB/s.

The RAID chunk size is 64KB and the filesystem block size is 2KB (I'm going to change to the minimum chunk size and a 4KB block size).
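
For reference, if I stay on ext4, the realignment I have in mind would look roughly like this (the device name is a placeholder, and the numbers assume I keep the current 64KB chunk with 5 data-bearing disks in the RAID 10; they'd change with a different chunk size):

```
# Hypothetical ext4 re-creation aligned to the array:
# stride       = chunk / block       = 64KB / 4KB = 16
# stripe-width = stride * data disks = 16 * 5     = 80 (10-disk RAID 10)
mkfs.ext4 -b 4096 -E stride=16,stripe-width=80 /dev/sdb1
mount -o noatime /dev/sdb1 /data
```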

The directory pattern is /data/x/yz/zyxabc.gz

I'm using EXT4 with plans to move to XFS. The OS is RHEL 6.


As of now it works great. The workload is 99% reads, and it can serve up to 300 files/second under normal conditions. The problem is backups: a full backup takes 6 days with scp, rsync is even slower, and dd runs at about 2MB/s. LVM snapshots could be an option if I take the snapshot, back it up, and then delete it. Data consistency is very important to me.
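
The snapshot approach I have in mind would look roughly like this (the volume group, LV and mount-point names are placeholders):

```
# Rough sketch: snapshot, back up from the snapshot, then drop it
lvcreate --snapshot --size 20G --name data-snap /dev/vg0/data
mkdir -p /mnt/data-snap
mount -o ro /dev/vg0/data-snap /mnt/data-snap
rsync -a /mnt/data-snap/ backuphost:/backups/data/
umount /mnt/data-snap
lvremove -f /dev/vg0/data-snap
```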

Files are about 0.5-4KB each. Would I see increased backup performance if I stored all of the files in a database instead? What other alternatives are there for me to tackle the problem of backing up this many small files in a reasonable window?

MDMarra
cedivad

4 Answers

3

Have you considered solutions like AMANDA or Bacula?
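
For instance, a minimal Bacula FileSet/Job that backs up /data in full once and then only picks up changed files might look roughly like this (a sketch only; the resource names, client and schedule are placeholders, untested):

```
# bacula-dir.conf sketch (names are placeholders)
FileSet {
  Name = "DataFiles"
  Include {
    Options {
      signature = MD5
    }
    File = /data
  }
}

Job {
  Name = "BackupData"
  Type = Backup
  Level = Incremental      # full once, then only changed files
  Client = fileserver-fd
  FileSet = "DataFiles"
  Schedule = "WeeklyCycle"
  Storage = File
  Messages = Standard
  Pool = Default
}
```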

2

I plan to move to XFS

You'd better pre-order tons of Prozac in that case. :-) XFS sucks a lot on that pattern (lots of tiny files), alas.

If you're considering an FS change, Reiser3 is the only option worth trying in that case, IMO. With notail you get less CPU overhead; without notail, less disk-space overhead.
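
For what it's worth, tail packing is just a mount option on reiserfs, so it's easy to try both ways (the device name is a placeholder):

```
mkfs.reiserfs /dev/sdb1
# notail: file tails aren't packed into the tree -> less CPU, more space used
mount -o notail /dev/sdb1 /data
# default (tail packing on): better space efficiency for tiny files, more CPU
mount /dev/sdb1 /data
```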

A RAID chunk of 64 KB is also beyond sanity: why overflow the disk I/O queues with such tiny requests? Increase it instead of decreasing it! With lots of simultaneous I/O it won't hurt.

Now, when it comes to backing up, it's worth mentioning COW filesystems such as Btrfs or NILFS. LVM2 snapshots may be OK as well, so you could try combining them with a migration to Reiser3. But I guess the COW filesystems have a better chance of giving you what you need.
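
As a rough illustration of the Btrfs variant of that workflow (this assumes /data is a Btrfs subvolume and /snapshots lives on the same Btrfs filesystem; the paths and destination are placeholders):

```
# Read-only snapshot, back it up at leisure, then drop it
btrfs subvolume snapshot -r /data /snapshots/data-$(date +%F)
rsync -a /snapshots/data-$(date +%F)/ backuphost:/backups/data/
btrfs subvolume delete /snapshots/data-$(date +%F)
```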

Iterator
poige
  • I see two ways to fix my problem: (1) keep the disks remotely synced with a remote backup server, and take backups on that server using snapshots, so the overhead on the production server is minimal; (2) use XFS under OpenSolaris. I know NOTHING about Solaris, but it seems to be rock solid. I'd have working snapshots for backups and another important feature: I could use a 500GB SSD drive I have as a pool cache, so half of the files would load in no time (200 million files = 1TB). Or maybe a mix of the two options: XFS on Solaris used for caching, with the data living on the remote server to back up. – cedivad Nov 09 '11 at 13:13
  • 2
    @cedivad, I think you're messing up XFS and ZFS. Don't. :) – poige Nov 09 '11 at 15:42
  • Yes, sorry =) I will use zfs under solaris ;) – cedivad Nov 09 '11 at 16:02
  • With lots of small files, wouldn't he want to DECREASE stripe size? – Bigbio2002 Nov 10 '11 at 18:32
  • @Bigbio2002, you might, but not me: a small stripe size means involving several disks even for relatively small I/O. You can get increased bandwidth for a single operation, but busier disks under simultaneous I/O threads. That's worse, especially nowadays when the throughput of a typical SATA disk is quite enough to serve a single file request by itself. – poige Nov 11 '11 at 09:32
  • Interesting, didn't know that. RAID optimization is a complex topic. – Bigbio2002 Nov 11 '11 at 16:00
  • @Bigbio2002, it's mostly just a matter of logical thinking. People often tend to "cache" knowledge/logical reasoning and don't realize the cached data has become obsolete. – poige Nov 11 '11 at 19:02
  • XFS has made huge progress in the past years. I successfully tested a filesystem with 1 billion files, without any problem. Recent options like "lazy-count" and "inode64" allow much better journal management. – wazoox Nov 17 '11 at 17:51
  • 1
    @wazoox, those options are nothing compared to the long-awaited `delaylog`, but its stability is still in question: http://comments.gmane.org/gmane.comp.file-systems.xfs.general/35886 Also, it's unclear what you count as a problem, since waiting 4 hours instead of 1 could be no problem for some people as well. – poige Nov 18 '11 at 01:54
  • @poige, as you can see I've posted in the thread you pointed to, but this was a year ago and most probably a 3Ware related problem more than an XFS bug, AFAIK. – wazoox Nov 18 '11 at 11:53
  • @wazoox, nope I haven't noticed your posting there. – poige Nov 18 '11 at 12:07
1

Either use a backup solution that supports incremental backups, such as those already mentioned, or perhaps use a script that traverses the tree and only copies files newer than a certain modification time?
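
A minimal sketch of that idea (the marker-file path and the backup destination are placeholders):

```
#!/bin/sh
# Copy only files changed since the last run, tracked via a timestamp file
STAMP=/var/backups/.last-backup
[ -f "$STAMP" ] || touch -d '1970-01-01' "$STAMP"   # first run copies everything
touch /var/backups/.this-run                        # mark the start so nothing is missed
find /data -type f -newer "$STAMP" -print0 \
    | rsync -a --from0 --files-from=- / backuphost:/backups/
mv /var/backups/.this-run "$STAMP"
```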

I'm not sure what you mean by "I need consistency", though. Do you mean all files need to be backed up at the same point in time (i.e. a snapshot)? In that case I'm not sure any sort of tar, copy, rsync or similar will work; you'll HAVE to use something that can create filesystem snapshots, or pause whatever process is creating these files in the first place.

Cylindric
0

"DD goes at about 2MB/s"

I'm confused: doesn't dd do a sequential read of the device (or at least attempt to)? Is it competing with the online use of these files? If so, I think more or faster disks are in order. 1TB SAS is still 7,200 RPM if I'm not mistaken; you can pick up 600GB 15K SAS drives, which will cut your seek times drastically.

Are you dumping it to a RAM disk, so that the destination can't be the bottleneck of the dd test (and you're not writing it right back to the local disks, again causing heavy seeking)?
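
For example, something like this reads the raw device, bypasses the page cache, and discards the output, so the destination can't be the bottleneck (the device name is a placeholder):

```
# Raw sequential read of the array, output discarded (reads 4GiB)
dd if=/dev/sda of=/dev/null bs=1M count=4096 iflag=direct
```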

If 2MB/s is the best you're going to get out of the fastest possible read pattern, you need faster disks.

However, dd won't give you a consistent snapshot without combining it with something else.

StrangeWill