
I am profiling binary data which has

  • an increasing Unix block value (the `Blocks` field from `stat`) as the number of events increases, as in the following figure
  • but the byte distance between events stays constant
  • I have noticed changes in other fields of the file which may explain the increasing Unix block value

[Figure: the Blocks value reported by stat increases as the number of events increases]

The Unix block value is a dynamic measure. I am interested in why it increases with larger memory units on some systems; I had assumed it should be constant. I produced the stat output in different environments:

  • Debian Linux 8.1 with its default stat
  • OSX 10.8.5 with Xcode 6 and its default stat

Greybeard's comment may have the answer to the blocks behaviour:

The stat (1) command used to be a thin CLI to the stat (2) system call, which used to transfer relevant parts of a file's inode. Pretty early on, the meaning of the st_blksize member of the C struct returned by stat (2) was changed to "preferred" blocksize for efficient file system I/O, which carries well to file systems with mixed block sizes or non-block oriented allocation.

How can you measure the block size in cases (1) and (2) separately?

Why can the Unix block value increase with a bigger memory size?

Léo Léopold Hertz 준영
  • Can you elaborate on what the block size refers to and how you're measuring memory usage (and what specific memory you're measuring)? – templatetypedef Aug 20 '15 at 17:02
  • @templatetypedef Those details are from the command `stat`. The blocks is B, the same as here https://en.wikipedia.org/wiki/B-tree. I extended the body. – Léo Léopold Hertz 준영 Aug 20 '15 at 17:43
  • You need to be much clearer about the context. What algorithm? What are the events? Size of what? What are the blocks (how do they relate to the B-tree)? What is wrong with your complexities? Also, isn't this O(size) space complexity? – Nico Schertler Aug 21 '15 at 07:59
  • The `stat (1)` command used to be a thin CLI to the `stat (2)` system call, which used to transfer relevant parts of a file's `inode`. Pretty early on, the meaning of the st_blksize member of the `C struct` returned by `stat (2)` was changed to `"preferred" blocksize for efficient file system I/O`, which carries well to file systems with mixed block sizes or non-block oriented allocation. Can you explain what `B` you are referring to and why this is tagged `algorithm`; tell which distribution(s)/implementation(s) you used and provide sample output(s)? (just noticed rev.1 - where's `B`?) – greybeard Sep 04 '15 at 05:55
  • @greybeard Algorithm tag because I hope the internal behaviour of the algorithm can be studied here. `stat` has a dynamic behaviour. Complexities because this event may be related to other challenges in my system. I added other answers to the body of the question. – Léo Léopold Hertz 준영 Sep 04 '15 at 07:58
  • Things still escaping me: What is an `event`? What is `block size`: size of a single block reported by `stat (1)`, or number of blocks reported allocated (which should be expected to increase as more bytes/blocks are written)? (In the beginning, there were (as needed) "direct blocks" (block number in inode), an indirect block (full of block #s of further blocks), a double-indirect block, … In effect, #blocks increased with n log(n) for n bytes written, with very low constant factors. – greybeard Sep 04 '15 at 12:11
  • An event is an abstract entity. As the number of events increases, the block value increases. The block value is the tangent of the figure, i.e. positive and increasing. – Léo Léopold Hertz 준영 Sep 04 '15 at 12:38

1 Answer


"Stat blocks" is not a block size. It is number of blocks the file consists of. It is obvious that number of blocks is proportional to size. Size of block is constant for most file systems (if not all).

Konstantin Svintsov
  • Thank you for your answer! Yes, I looked in a hex editor and saw that the byte difference between events in the file is fixed, which also indicates a constant block size. So the number of blocks is increasing because the file size grows linearly. In which kinds of space complexities can such a linear trend between the number of blocks and the file size be seen? How can you change/improve such a condition? When should you change it? – Léo Léopold Hertz 준영 Sep 10 '15 at 15:01
  • A file is stored in a number of blocks of some fixed size (the size depends on the file system). So the number of blocks is the file size divided by the block size, rounded up. – Konstantin Svintsov Sep 10 '15 at 15:37