I am working on a Linux kernel module that needs to check data right before it is written to a local disk. The data is fetched from a remote disk, so I expect it to be in the page cache by the time the write is issued. I also know that Linux represents in-flight block I/O requests with a data structure called struct bio.
Each struct bio contains a list of structures called bio_vecs:
struct bio_vec {
	/* pointer to the physical page on which this buffer resides */
	struct page *bv_page;
	/* the length in bytes of this buffer */
	unsigned int bv_len;
	/* the byte offset within the page where the buffer resides */
	unsigned int bv_offset;
};
It has a list of these because a block's representation in memory may not be physically contiguous. What I want to do is grab each piece of the buffer using the list of bio_vecs and put the pieces together so that I can take an MD5 hash of the whole block. How do I use the page pointer, the buffer length, and the offset to get at the raw data in the buffer? Are there existing functions for this, or do I have to write my own?
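To make the question concrete, here is a minimal user-space sketch of the gathering step I have in mind. It is only a model, not kernel code: mock_bio_vec and gather_segments are hypothetical names I made up, and bv_page here is a plain byte pointer standing in for an already-mapped page. In the kernel I assume the page would first have to be mapped (for example with kmap_local_page() or kmap_atomic(), depending on the kernel version) and that the segments would be walked with bio_for_each_segment(), but I have not verified that this is the right approach.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for the kernel's struct bio_vec.
 * Assumption: bv_page is already a usable address; in the kernel it is
 * a struct page * that has to be mapped before it can be dereferenced.
 */
struct mock_bio_vec {
	unsigned char *bv_page;   /* stands in for the mapped page address */
	unsigned int bv_len;      /* length in bytes of this segment */
	unsigned int bv_offset;   /* byte offset of the segment within the page */
};

/*
 * Copy every segment, in order, into one contiguous buffer.
 * Returns 0 and stores the total length in *out_len on success,
 * or -1 (leaving *out_len untouched) if dst is too small.
 */
int gather_segments(const struct mock_bio_vec *vecs, size_t nvecs,
		    unsigned char *dst, size_t dst_size, size_t *out_len)
{
	size_t total = 0;

	for (size_t i = 0; i < nvecs; i++) {
		const struct mock_bio_vec *bv = &vecs[i];

		/* Refuse to write past the end of the destination buffer. */
		if (bv->bv_len > dst_size - total)
			return -1;

		/* The segment's data starts bv_offset bytes into the page. */
		memcpy(dst + total, bv->bv_page + bv->bv_offset, bv->bv_len);
		total += bv->bv_len;
	}

	*out_len = total;
	return 0;
}
```

The contiguous buffer would then be what gets passed to the MD5 routine. Copying may not even be necessary if the hash can be updated one segment at a time, which is part of what I am asking.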