I'd like to reference the following post as well, "How to obtain random access of a gzip compressed file", and mention that I'm familiar with Biopython.
I'm familiar with Bio.bgzf's support for indexes and random reads. I'm building a library that uses the module to build an index of the blocks that contain the data I'm interested in. The technology is very interesting, but I'm struggling to understand the pace of development and the limitations of what Bio.bgzf, or even the BGZF format itself, is capable of.
Can Bio.bgzf overwrite a specific line in the file, just as it can read from a virtual offset to the end of the line? If it could, would the new data necessarily need to be exactly the same number of bits?
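To illustrate why I suspect the size question matters: each BGZF block is compressed on its own, so two lines of identical uncompressed length can still produce compressed payloads of different sizes. The following is only a rough sketch using the standard library's zlib rather than Bio.bgzf; the helper name deflate_size and the sample lines are my own, made up for illustration.

```python
import zlib


def deflate_size(data: bytes) -> int:
    """Return the size of `data` after raw DEFLATE compression.

    wbits=-15 asks zlib for a raw DEFLATE stream with no header,
    which is the kind of payload a BGZF block carries.
    """
    compressor = zlib.compressobj(wbits=-15)
    return len(compressor.compress(data) + compressor.flush())


# Two hypothetical lines with the same uncompressed length (61 bytes each).
old_line = b"A" * 60 + b"\n"
new_line = b"".join(b"%02d" % i for i in range(30)) + b"\n"

old_size = deflate_size(old_line)
new_size = deflate_size(new_line)

print(len(old_line), len(new_line))  # same uncompressed length
print(old_size, new_size)            # compressed sizes differ
```

If that reasoning holds, replacing a line in place would change the compressed size of its block even when the uncompressed length is unchanged, which would shift every block after it.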
After using make_virtual_offset() to acquire a position in the .bgzf file for a line that I'd like to overwrite, I'm looking for a method like filehandle.writeline() to replace the line in the block with some new text. If that's not possible, is it possible to get the coordinates of the entire block and then rewrite it? And if not, it could be said that BGZF index files are sufficient for reading only. Is this correct?
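For context, this is my understanding of how a virtual offset is put together: the byte offset of the block's start in the compressed file goes in the upper 48 bits, and the offset within the decompressed block goes in the lower 16 bits. The sketch below uses only plain Python; pack_virtual_offset and unpack_virtual_offset are names I made up to mirror what I believe Bio.bgzf's make_virtual_offset() and split_virtual_offset() do, so please correct me if I have this wrong.

```python
def pack_virtual_offset(block_start: int, within_block: int) -> int:
    """Combine a block's file offset and an in-block offset into one integer."""
    if not 0 <= within_block < 2**16:
        raise ValueError("within-block offset must fit in 16 bits")
    if not 0 <= block_start < 2**48:
        raise ValueError("block start offset must fit in 48 bits")
    return (block_start << 16) | within_block


def unpack_virtual_offset(virtual_offset: int) -> tuple[int, int]:
    """Split a virtual offset back into (block_start, within_block)."""
    return virtual_offset >> 16, virtual_offset & 0xFFFF


# Hypothetical values: a block starting at byte 12345 of the compressed file,
# and a line starting 678 bytes into that block's decompressed data.
offset = pack_virtual_offset(12345, 678)
print(offset)
print(unpack_virtual_offset(offset))
```

So the offset tells me where the block begins and where the line begins inside it, but nothing about how long the compressed block is, which is why I'm asking about getting the coordinates of the entire block.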