I have a 16GB file that I read sequentially from the hard disk in 4KB blocks, and I want to measure the read time when I read:
- one block at a time,
- one block every 2,
- one block every 4,
- ...
- one block every 512
My code looks like this:
...
constexpr size_t B = 1 << 12;  // 4KB block size
constexpr size_t j = 1;        // jump: 1, 2, 4, ..., 512
std::ifstream f(filename, std::ios::binary); // binary mode, so nothing is translated
size_t M = N / (j * B);        // number of blocks actually read
auto t1 = std::chrono::high_resolution_clock::now();
for (size_t i = 0; i < M; i++) {
    f.seekg(i * j * B, std::ios::beg);
    f.read(buff, B);
}
auto t2 = std::chrono::high_resolution_clock::now();
auto elaps = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
f.close();
...
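For completeness, here is a self-contained version of the loop above; everything the "..." hides (the 16GB size N, the buffer, the filename) is filled in with assumed placeholder values:

#include <chrono>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <vector>

int main() {
    constexpr size_t B = 1 << 12;          // 4KB block size
    constexpr size_t N = size_t(16) << 30; // 16GB file size (assumed)
    constexpr size_t j = 1;               // jump: 1, 2, 4, ..., 512

    std::vector<char> buff(B);            // read buffer (assumed heap-allocated)
    std::ifstream f("file.bin", std::ios::binary); // "file.bin" is a placeholder
    if (!f) { std::cerr << "cannot open file\n"; return 1; }

    size_t M = N / (j * B);               // number of blocks actually read

    auto t1 = std::chrono::high_resolution_clock::now();
    for (size_t i = 0; i < M; i++) {
        f.seekg(i * j * B, std::ios::beg);
        f.read(buff.data(), B);
    }
    auto t2 = std::chrono::high_resolution_clock::now();

    auto elaps = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
    std::cout << "j=" << j << "  M=" << M << "  time=" << elaps / 1000.0 << " ms\n";
}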
When I measure the times, however, I see this strange behavior:
  j         M    % of file   time (ms)
  1   4194304     100.0%      79815.3
  2   2097152      50.0%      80141.9
  4   1048576      25.0%      79963.0
  8    524288      12.5%      79721.7
 16    262144       6.3%      79974.9
 32    131072       3.1%      80374.9
 64     65536       1.6%      80708.3
128     32768       0.8%      80674.9
256     16384       0.4%      80423.3
512      8192       0.2%      17308.4
j is the 'jump': I read one 4KB block every j blocks. M is the total number of blocks needed to cover the whole 16GB file.
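As a sanity check on M: with N = 16GB = 2^34 bytes and B = 2^12 bytes, M = N / (j * B) = 2^22 / j, so j = 1 gives 4194304 blocks and j = 512 gives 8192, matching the table. The total bytes requested are M * B = 16GB / j.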
The file is a binary file containing randomly generated bytes.
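For reproducibility, a test file like this can be generated with a sketch along these lines (the filename, seed, and choice of std::mt19937_64 are placeholders, not necessarily what I originally used):

// Sketch: generate a 16GB file of random bytes.
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <random>
#include <vector>

int main() {
    constexpr size_t total = size_t(16) << 30;   // 16GB
    constexpr size_t chunk = 1 << 20;            // write 1MB at a time

    std::mt19937_64 rng(12345);                  // fixed seed: reproducible content
    std::vector<uint64_t> buf(chunk / sizeof(uint64_t));

    std::ofstream out("file.bin", std::ios::binary);
    for (size_t written = 0; written < total; written += chunk) {
        for (auto& w : buf) w = rng();           // fill the chunk with random bits
        out.write(reinterpret_cast<const char*>(buf.data()), chunk);
    }
}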
What's going on?
- Why is the total time roughly constant even though the total number of bytes read decreases?
- Is the seek effectively ignored, with the file being read without skipping?
- What happens at j = 512?