I've written a basic serialiser (mainly just to see if I could get one working). Long story short, I had to remove what little compression I had so that I could read a huge file without running into memory problems (I'm using seek and read together to pull in only the part of the file I need at the time). As you can see below, what was originally a much shorter base64-encoded string is now a very long string of 1s and 0s.
input = (None, False, 'one', {2: [3, 4, 5]})
#encoded = 'WCVBFA29uZUggwEBDAQGMBAgwECg'
#not encoded = '01011000001001010100000100010100000011011011110110111001100101001110000000100011000000010000000100100100000011000110000000100000001100011000000010000001000001100000001000000101'
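The seek and read pattern described above can be sketched roughly as follows. This is only an illustration, not the actual serialiser code: it assumes the file holds one ASCII character per bit, and read_segment is a hypothetical helper name.

```python
import os
import tempfile

def read_segment(path, start, length):
    """Return `length` characters of the file beginning at offset `start`."""
    # Binary mode makes seek offsets plain byte offsets. With one ASCII
    # character per bit (an assumption), byte offset == bit index.
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(length).decode("ascii")

# Usage: write a short string of bits to a throwaway file, then read
# only the second group of eight characters without loading the rest.
bits = "0101100000100101"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(bits)
segment = read_segment(tmp.name, 8, 8)
os.remove(tmp.name)
```

Only the requested slice is ever held in memory, which is why the file size on disk is the remaining problem.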
I had a look into it, but there doesn't seem to be a way to read a file as individual bits, so I'm kind of stuck with the current approach. I'm guessing there isn't an easy way to compress the file and still read it, since I'm already reading small segments, but would I be able to specify at file creation that I want Windows itself to compress it?
As it's literally only storing the characters 0 and 1, even the default compression should do wonders for it; I just don't want to have to load it all into memory up front to decompress the entire thing.
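For comparison, here is a hedged sketch of one possible alternative that keeps the same seek and read pattern. Files can only be read in whole bytes, but the bits can be packed eight to a byte and a bit range recovered by seeking to the bytes that contain it. The names pack_bits and read_bits are hypothetical, and the zero padding of the final byte is an assumption about the layout, not part of the original serialiser.

```python
import os
import tempfile

def pack_bits(bit_string):
    """Pack a string of '0'/'1' characters into bytes, eight bits per byte."""
    # Pad the tail with zeros so the length is a multiple of 8 (assumption:
    # the real bit count is stored elsewhere if the padding matters).
    padded = bit_string + "0" * (-len(bit_string) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))

def read_bits(path, start, length):
    """Read `length` bits beginning at bit index `start` from a packed file."""
    if length == 0:
        return ""
    first_byte = start // 8
    last_byte = (start + length - 1) // 8
    with open(path, "rb") as f:
        f.seek(first_byte)  # jump straight to the first byte that is needed
        chunk = f.read(last_byte - first_byte + 1)
    # Expand only the bytes that were read back into '0'/'1' text.
    as_text = "".join(format(b, "08b") for b in chunk)
    offset = start % 8
    return as_text[offset:offset + length]

# Usage: 28 bits pack into 4 bytes, and a 10-bit slice is read back.
bits = "0101100000100101010000010001"
packed = pack_bits(bits)
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(packed)
result = read_bits(tmp.name, 5, 10)
os.remove(tmp.name)
```

This stores the data in one eighth of the space of the character-per-bit file while still touching only the bytes needed for each read.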