I need to read arbitrarily sized (but generally small) chunks of binary data from an SQLite database. The database lives on disk, and the data is stored in rows consisting of an id and a read-only blob of between 256 bytes and 64 KiB (the length is always a power of 2). I use SQLite's incremental blob I/O to read the chunks into a rewritable buffer, then take the average of the values in each chunk and cache the result.
The problem is that, since the chunks are of arbitrary size, the blob size will only occasionally be an integer multiple of the chunk size. This means that a chunk will quite frequently span two blobs.
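As a quick illustration of how the overlap arises, here is a small Python sketch that counts the chunks crossing a blob boundary. It assumes, purely for simplicity, that every blob has the same size and that chunks are laid end to end across consecutive blobs; `straddling_chunks` is a hypothetical helper, not part of my actual code.

```python
def straddling_chunks(blob_size: int, chunk_size: int, n_blobs: int) -> int:
    """Count chunks that cross a blob boundary (uniform blob size assumed)."""
    total = blob_size * n_blobs
    count = 0
    # Walk over every complete chunk in the concatenated byte stream.
    for start in range(0, total - chunk_size + 1, chunk_size):
        last = start + chunk_size - 1  # index of the chunk's final byte
        if start // blob_size != last // blob_size:
            count += 1
    return count


print(straddling_chunks(256, 64, 4))   # 0: 64 divides 256, so chunks never straddle
print(straddling_chunks(256, 100, 4))  # 3: one chunk crosses each of the 3 boundaries
```

Whenever the chunk size does not divide the blob size, roughly one chunk per boundary ends up split between two blobs.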
What I am looking for is a simple and elegant (since 'elegance is not optional') way to handle this slightly awkward scenario. I have a `read-chunk` function which is fairly dumb, simply reading the chunks and computing averages. So far I have tried the following strategies:
- Read only the first part of an overlapping chunk, discarding the second.
- Make `read-chunk` aware of blob boundaries, so that it can move to the next blob where appropriate.
- Use something like a ring buffer, so that overlapping chunks can just wrap around the edges.
The first option is the simplest but is unsatisfactory because it discards potentially important information. Since `read-chunk` is called frequently, I don't want to overburden it with too much branching logic, so the second option isn't appealing either. Using a ring buffer (or something like it) seems like an elegant solution. What I envisage is a producer which reads intermediately sized (say, 256-byte) pieces from the blob into a 1 KiB buffer, and a consumer which calls `read-chunk` on the buffer, wrapping around where appropriate. Since I will always be dealing with powers of 2, the producer will always align to the edges of the buffer, and I can avoid using `mod` to compute the indices for both producer and consumer (a bitwise AND with the buffer size minus one does the same job).
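To make the ring-buffer idea concrete, here is a rough sketch of what I have in mind, written in Python rather than CL for brevity. All names (`RingBuffer`, `chunk_averages`, `piece_size`) are illustrative. It assumes the buffer capacity is a power of two, that the chunk size plus the piece size fits in the buffer, and that blobs arrive as plain byte strings (standing in for the SQLite incremental reads). A trailing partial chunk is simply dropped.

```python
from typing import Iterable, Iterator


class RingBuffer:
    """Byte ring whose capacity must be a power of two (assumption)."""

    def __init__(self, capacity: int = 1024) -> None:
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self.buf = bytearray(capacity)
        self.mask = capacity - 1  # index & mask replaces index mod capacity
        self.head = 0  # total bytes written so far (producer position)
        self.tail = 0  # total bytes consumed so far (consumer position)

    def available(self) -> int:
        return self.head - self.tail

    def write(self, piece: bytes) -> None:
        """Producer side: append a piece, wrapping at the buffer edge."""
        if len(piece) > len(self.buf) - self.available():
            raise BufferError("ring buffer overflow")
        for byte in piece:
            self.buf[self.head & self.mask] = byte
            self.head += 1

    def read_average(self, n: int) -> float:
        """Consumer side: average the next n bytes, wrapping as needed."""
        if n > self.available():
            raise BufferError("not enough buffered data")
        total = 0
        for _ in range(n):
            total += self.buf[self.tail & self.mask]
            self.tail += 1
        return total / n


def chunk_averages(blobs: Iterable[bytes], chunk_size: int,
                   piece_size: int = 256, capacity: int = 1024) -> Iterator[float]:
    """Yield the average of each chunk, treating the blobs as one stream."""
    if chunk_size + piece_size > capacity:
        raise ValueError("chunk_size + piece_size must fit in the buffer")
    ring = RingBuffer(capacity)
    for blob in blobs:
        for offset in range(0, len(blob), piece_size):
            ring.write(blob[offset:offset + piece_size])
            # Drain every complete chunk before producing more data.
            while ring.available() >= chunk_size:
                yield ring.read_average(chunk_size)


blobs = [bytes(range(256)), bytes([10] * 256)]
print(list(chunk_averages(blobs, chunk_size=100)))
# [49.5, 149.5, 131.8, 10.0, 10.0]
```

The third average (131.8) comes from a chunk that takes its first 56 bytes from one blob and its last 44 from the next, yet the consumer has no branching on blob boundaries at all; the only per-byte overhead is the mask.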
I am using Lisp (CL), but since this seems to be a general algorithmic or data structure question I have left it language-agnostic. What I am interested in is clarifying what options I have: is there an option other than the ones I've listed?