I was trying to measure system performance with the sysbench fileio test. However, I'm not sure what I am actually changing when I change the file-block-size parameter.
Previously I thought it was the file system block size, but then I looked at the code, and it is actually a wrapper around the file system block size. The pseudocode for how sysbench reads a file in the fileio test is as follows (mainly derived from sb_fileio.c):
while current_pointer < file_leng:
    read_leng = min(file_block_size, file_leng - current_pointer)
    pread(fd, read_buf, read_leng, current_pointer)
    current_pointer += read_leng
Sysbench uses pread here, a syscall implemented by the file system. When the file size is smaller than file_block_size, that parameter has no effect, since the read size is capped at the file size and is therefore always smaller than the file_block_size we gave it. The actual block size used inside pread (i.e. how many bytes have to be loaded from disk into memory even if we just want to read 1 byte) is already defined by the file system (if not by the hardware).
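To make the loop above concrete, here is a minimal runnable sketch of it in Python. It is my own rendition of the pseudocode, not sysbench's actual C code; the names read_in_chunks and demo are hypothetical, and it assumes a Unix-like system where os.pread is available. It records the length passed to each pread call so the effect of file_block_size on the request size can be seen directly.

```python
import os
import tempfile


def read_in_chunks(path, file_block_size):
    """Read a whole file with pread, at most file_block_size bytes per call.

    Returns the list of read lengths that were requested from pread.
    """
    request_lengths = []
    file_leng = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        current_pointer = 0
        while current_pointer < file_leng:
            # Same cap as in the pseudocode: never ask for more than is left.
            read_leng = min(file_block_size, file_leng - current_pointer)
            data = os.pread(fd, read_leng, current_pointer)
            if not data:
                break  # unexpected end of file
            request_lengths.append(read_leng)
            current_pointer += len(data)
    finally:
        os.close(fd)
    return request_lengths


def demo():
    """Create a 16K scratch file and read it with a 1024K and a 4K block size."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * 16 * 1024)
        path = f.name
    try:
        return (read_in_chunks(path, 1024 * 1024),
                read_in_chunks(path, 4 * 1024))
    finally:
        os.unlink(path)
```

Calling demo() returns the request lengths for both block sizes: with a 1024K block size the 16K file is read in a single 16K request, and with a 4K block size it is read in four 4K requests.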
For example, suppose the file system block size used by pread is 4K. When sysbench's file_block_size is 1K/2K/4K, each pread syscall will get us a 4K/4K/4K block; when sysbench sets file_block_size = 1024K and file_size = 1024K, each pread syscall will get us 256 * 4K blocks (instead of 1 * 1024K block); but when file_block_size = 1024K and file_size = 16K, the read length sent to pread will always be just 16K, and instead of retrieving 1024K (= 256 * 4K), it will retrieve 4 * 4K blocks, since it uses min(file_size, file_block_size), and that's it.
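The arithmetic in this example can be written down as a small sketch. This only encodes my mental model (a request capped at the file size, then rounded up to whole file system blocks, starting at an aligned offset); it says nothing about what the kernel really does. The function name blocks_per_pread and the 4K default are assumptions for illustration.

```python
K = 1024


def blocks_per_pread(file_block_size, file_size, fs_block_size=4 * K):
    """Return (request_bytes, fs_blocks) for one pread under the model above.

    request_bytes is the length handed to pread; fs_blocks is how many
    file system blocks that request would span if it starts block-aligned.
    """
    request_bytes = min(file_block_size, file_size)
    fs_blocks = -(-request_bytes // fs_block_size)  # ceiling division
    return request_bytes, fs_blocks


# The three cases from the example, with a 4K file system block size:
small_request = blocks_per_pread(1 * K, 1024 * K)     # 1K request, 1 block
full_request = blocks_per_pread(1024 * K, 1024 * K)   # 1024K request, 256 blocks
capped_request = blocks_per_pread(1024 * K, 16 * K)   # capped to 16K, 4 blocks
```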
Is my understanding right? If so, what am I actually changing when I change that parameter? Or am I supposed to always set the file_size to be bigger than the file_block_size?
Also, when loading 1024K, sysbench is actually loading 256 * 4K blocks inside the pread syscall rather than the 1024K as a whole. Should there be any performance (throughput/latency) difference between these two behaviors?
=====
The command I used:
./sysbench --file-block-size=<file_block_size> --file-total-size=65536K --file_num=<file_num> --file-test-mode=rndrd --file-fsync-all=on --file-extra-flags=direct fileio <prepare/run/cleanup>
The file_block_size is in {1K, 4K, 16K, 256K, 1024K, etc.}, the file_num is in {1, 4, 16, ..., 65536} ==> single file size is in {65536K, 16384K, ..., 1K}. The result I get:
[Figure: Latency (us) over file size (K) with different file_block_sizes]
Here, 16K files with a 256K file_block_size have much lower latency than 256K files with a 256K file_block_size. That should not be the case if the file_block_size were the load unit size of the hardware, so it is not the file system block size (I have an ext2/ext3 file system with a 4K block size). Then what is it?
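For reference, here is a small sketch of the parameter grid I tested, assuming the read length follows the pseudocode earlier in this post (capped at the file size). The function name effective_request_sizes is hypothetical, and the values are computed from my command-line settings, not measured.

```python
def effective_request_sizes(total_size_k, file_nums, file_block_size_k):
    """Map each per-file size (in K) to the read length (in K) that the
    pseudocode loop would pass to pread for that file size."""
    result = {}
    for file_num in file_nums:
        file_size_k = total_size_k // file_num  # --file-total-size / file_num
        result[file_size_k] = min(file_block_size_k, file_size_k)
    return result


# file_num in {1, 4, 16, ..., 65536}, total size 65536K, file_block_size 256K
sizes = effective_request_sizes(65536, [4 ** i for i in range(9)], 256)
```

Under this assumption, sizes[16] is 16 while sizes[256] is 256, so the two data points I compared issue requests of different lengths even though file_block_size is 256K in both.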