
I have many directories that each hold a large number of files (~5000 per directory), and this is significantly slowing down my file system access. I have plenty of space, and the data is important. I'd like to combine the files into a single file per directory. Creating an archive would be the simple solution, but I don't want to reduce recoverability. Some sort of flat image (e.g., an uncompressed tar file) would work fine, but I would think there's a format that could actually be more recoverable (e.g., by storing parity information) in the same amount of space. I'm working in a mixed Unix/Linux/Mac environment.

Is there an image/compression format that minimizes compression while providing parity-type information, or would a raw image be the maximally recoverable file format?

Bill Gross
  • What is "more recoverable"? Why do you think "recoverability" will be reduced by a tar file? How so? – Mark Adler Jun 01 '14 at 08:48
  • By recoverability I mean the amount of original data that can be extracted from the data after a specific amount of damage (i.e., changed bits) – Bill Gross Jun 06 '14 at 16:30
  • I didn't mean to imply that tar would have _reduced_ recoverability -- as I understand it, it will be about the same as the original data. I'm wondering if there would be something _better_ than the original. For example, if you could compress the file to half the size, and then kept two copies of the file it would have the same disk-space "cost", but be more robust to data loss – Bill Gross Jun 06 '14 at 16:31
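
To make the "parity-type information" idea above concrete, here is a minimal sketch of single-block XOR parity. It assumes the data is cut into `n` equal-sized blocks and that at most one whole block is lost and known to be missing; the function names are hypothetical, chosen only for illustration. Practical recovery tools typically rely on stronger codes such as Reed-Solomon, so this is an illustration of the principle rather than a recommended format.

```python
from functools import reduce


def split_blocks(data: bytes, n: int) -> list:
    """Split data into n equal-sized blocks, zero-padding the tail."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(n)]


def xor_blocks(blocks: list) -> bytes:
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(blocks: list, parity: bytes) -> bytes:
    """Rebuild the single block marked None from the others plus parity."""
    present = [b for b in blocks if b is not None]
    return xor_blocks(present + [parity])


data = b"important data that must survive"
blocks = split_blocks(data, 4)
parity = xor_blocks(blocks)      # costs one extra block of storage

damaged = list(blocks)
damaged[2] = None                # simulate losing one whole block
restored = recover_missing(damaged, parity)
```

With four data blocks the parity block adds 25% overhead and tolerates the loss of any one block, whereas keeping two full copies adds 100% overhead.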

1 Answer

You may be able to solve your performance problem simply by creating a deeper tree of subdirectories with far fewer files in each directory.
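
One possible way to do this is to bucket files by a hash of their names. The following is a minimal sketch, assuming the split can be arbitrary; `shard_name`, `shard_directory`, and the two-character bucket width are hypothetical choices for illustration, not part of any existing tool.

```python
import hashlib
import shutil
from pathlib import Path


def shard_name(filename: str, width: int = 2) -> str:
    """Return a bucket name taken from a hash of the file name."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    return digest[:width]


def shard_directory(root: Path, width: int = 2) -> int:
    """Move each regular file in root into root/<bucket>/.

    Returns the number of files moved. Existing subdirectories are
    left untouched.
    """
    moved = 0
    # sorted() snapshots the listing before any buckets are created.
    for entry in sorted(root.iterdir()):
        if not entry.is_file():
            continue
        bucket = root / shard_name(entry.name, width)
        bucket.mkdir(exist_ok=True)
        shutil.move(str(entry), str(bucket / entry.name))
        moved += 1
    return moved
```

With `width=2` there are up to 256 buckets, so ~5000 files average out to roughly 20 per subdirectory. A file can be located again later by recomputing `shard_name` from its name.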

Mark Adler
  • true, but all of the files "belong" to each other -- there's no deeper logical subdivision possible. I could arbitrarily divide them up, but I'd rather tie them all together in a single file – Bill Gross Jun 01 '14 at 19:12
  • The division does not have to be "logical". Arbitrary is fine. I've seen lots of applications that do that where there are many files, e.g. mail programs. – Mark Adler Jun 02 '14 at 07:57