
I'm trying to understand all the relevant implications of the --max-obj-size value used when creating an S3QL file system. I have yet to find a complete description of what this option affects, but I have been able to piece together a few bits from the docs and discussion groups.

Mainly, I have found reasons to use larger --max-obj-size values, which leaves me wondering: why not use an arbitrarily large value (10 MB? 100 MB? 1 GB?):

  • Smaller values mean more blocks (and thus more database entries) are needed, and worse performance from the SQLite metadata database, since storing the same set of files requires more entries (a rough estimate is sketched just after this list)
  • Smaller values can hurt throughput (especially for sequential reads).
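
To put a rough number on the metadata point, here is a back-of-envelope sketch. The one-entry-per-block model and the ceil(size / max_obj_size) split are my own assumptions, not something I have verified against the S3QL source:

    import math

    def block_count(file_sizes, max_obj_size):
        """Rough estimate of block entries in the metadata database.

        Assumes (my assumption) one entry per block and
        ceil(size / max_obj_size) blocks per non-empty file.
        """
        return sum(max(1, math.ceil(s / max_obj_size)) for s in file_sizes)

    # Example data set: a million 100 KiB files plus a thousand 1 GiB files
    sizes = [100 * 1024] * 1_000_000 + [1024 ** 3] * 1000
    for max_obj in (128 * 1024, 10 * 1024 ** 2, 100 * 1024 ** 2):
        print(f"max-obj-size {max_obj // 1024:>6} KiB -> ~{block_count(sizes, max_obj):,} entries")

With those made-up numbers the small files contribute one entry each no matter what, while the large files go from roughly 8 million entries at 128 KiB blocks to about 11,000 at 100 MiB.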

From the version 1.8 changelog:

As a matter of fact, a small S3QL block size does not have any advantage over a large block size when storing lots of small files. A small block size, however, seriously degrades performance when storing larger files. This is because S3QL is effectively using a dynamic block size, and the --blocksize value merely specifies an upper limit.
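
If I read that correctly, a file's blocks are only ever as large as the data actually requires, so the effective block size is roughly min(file size, --max-obj-size). A minimal sketch of that reading (my interpretation, not code from S3QL):

    def split_into_blocks(file_size, max_obj_size):
        """Split a file into blocks no larger than max_obj_size.

        My reading of "dynamic block size": the last (or only) block
        shrinks to fit, so a small file stays a single small block
        no matter how large --max-obj-size is.
        """
        blocks, offset = [], 0
        while offset < file_size:
            length = min(max_obj_size, file_size - offset)
            blocks.append(length)
            offset += length
        return blocks

    print(split_into_blocks(4 * 1024, 100 * 1024 ** 2))   # 4 KiB file -> [4096]
    print(len(split_into_blocks(1024 ** 3, 128 * 1024)))  # 1 GiB file, 128 KiB blocks -> 8192 pieces

Which, if accurate, explains the changelog wording: the small-file case is identical either way, but a small upper limit chops large files into far more pieces.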

So far the only advantages I have found or imagined for smaller block sizes are:

  • Less bandwidth used to re-write a portion of a file (a rough sketch follows this list)
  • Possibly better deduplication
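
The bandwidth point is the one I can quantify most easily. If every block that overlaps a modified byte range has to be re-uploaded in full (that is my assumption about how the write-back works), then the block size directly bounds the cost of a small in-place write:

    def reupload_bytes(offset, length, max_obj_size):
        """Bytes re-uploaded for an in-place write of `length` bytes at
        `offset`, assuming (my assumption) that each overlapping
        full-sized block is re-uploaded whole."""
        first = offset // max_obj_size
        last = (offset + length - 1) // max_obj_size
        return (last - first + 1) * max_obj_size

    # Rewriting 4 KiB in the middle of a large file:
    for max_obj in (128 * 1024, 10 * 1024 ** 2, 100 * 1024 ** 2):
        cost = reupload_bytes(50 * 1024 ** 2, 4096, max_obj)
        print(f"{max_obj // 1024:>6} KiB blocks -> re-upload {cost:,} bytes")

Under that assumption, a 4 KiB change costs 128 KiB of upload with small blocks but a full 100 MiB with very large ones.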

The --min-obj-size option does not affect deduplication. Deduplication happens before blocks are grouped.

The --max-obj-size affects deduplication, since it implicitly determines the maximum size of a block.
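
So, as I understand those two statements, deduplication operates on the blocks themselves, which makes --max-obj-size the granularity at which duplicates can be detected. A minimal sketch of block-hash deduplication under that assumption (fixed-offset blocks and SHA-256 are my simplifications, not necessarily what S3QL actually does):

    import hashlib

    def dedup_stats(files, max_obj_size):
        """Count unique vs. total blocks when each file is cut at fixed
        offsets and blocks are deduplicated by content hash
        (a simplified model, not S3QL's actual implementation)."""
        seen, total = set(), 0
        for data in files:
            for off in range(0, len(data), max_obj_size):
                seen.add(hashlib.sha256(data[off:off + max_obj_size]).hexdigest())
                total += 1
        return len(seen), total

    # Two files sharing a 1 MiB prefix but with different endings:
    a = b"x" * 2**20 + b"tail-a"
    b = b"x" * 2**20 + b"tail-b"
    print(dedup_stats([a, b], 128 * 1024))   # (3, 18): the shared prefix dedups
    print(dedup_stats([a, b], 10 * 2**20))   # (2, 2): one big block each, nothing shared

If that model is right, a smaller maximum block size raises the chance that identical stretches of different files line up into identical blocks, at the cost of more entries per file.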

Found here:

Can anyone offer a summary of the trade-offs one makes when selecting a larger or smaller --max-obj-size when creating an S3QL file system?
