I have a generic question about Linux kernel's handling of file I/O. So far my understanding is that, in an ideal case, after process A reads a file, data is loaded into page cache, and if process B reads the same page before it is reclaimed, it does not need to hit the disk again.
My question is related to how the block device I/O works. Process A's read request will eventually be queued before the I/O actually happens. Now if device B's request (a bio
struct) is to be inserted into the request_queue
, before A's request is executed, elevator will consider whether to merge B's bio
into any existing request
. Now, if A and B try to read the same file offset, i.e. same device block, they are literally the same I/O, (or A and B's requests are not exactly the same but they overlap for some blocks), but so far I have not seen this case being considered in kernel code. (The only relevant thing I saw is a test on whether bio
can be glued to an existing request
contiguously.)
kernel 2.6.11
inline int elv_try_merge(struct request *__rq, struct bio *bio)
{
int ret = ELEVATOR_NO_MERGE;
/*
* we can merge and sequence is ok, check if it's possible
*/
if (elv_rq_merge_ok(__rq, bio)) {
if (__rq->sector + __rq->nr_sectors == bio->bi_sector)
ret = ELEVATOR_BACK_MERGE;
else if (__rq->sector - bio_sectors(bio) == bio->bi_sector)
ret = ELEVATOR_FRONT_MERGE;
}
return ret;
}
kernel 5.3.5
enum elv_merge elv_merge(struct request_queue *q, struct request **req,
struct bio *bio)
{
struct elevator_queue *e = q->elevator;
struct request *__rq;
...
/*
* See if our hash lookup can find a potential backmerge.
*/
__rq = elv_rqhash_find(q, bio->bi_iter.bi_sector);
...
}
struct request *elv_rqhash_find(struct request_queue *q, sector_t offset)
{
struct elevator_queue *e = q->elevator;
struct hlist_node *next;
struct request *rq;
hash_for_each_possible_safe(e->hash, rq, next, hash, offset) {
...
if (rq_hash_key(rq) == offset)
return rq;
}
return NULL;
}
#define rq_hash_key(rq) (blk_rq_pos(rq) + blk_rq_sectors(rq))
Does that mean kernel will just do two I/Os? Or (very likely) I missed something?
thanks!