I've been trying to teach myself Python, so I don't fully understand what I'm doing. I'm embarrassed to say this, but my question should be really easy to answer. I want to be able to compute CRC checksums on binary files with code similar to this:
# http://upload.wikimedia.org/wikipedia/commons/7/72/Pleiades_Spitzer_big.jpg
import zlib

buffersize = 65536

with open('Pleiades_Spitzer_big.jpg', 'rb') as afile:
    buffr = afile.read(buffersize)
    while len(buffr) > 0:
        crcvalue = zlib.crc32(buffr)
        buffr = afile.read(buffersize)

print(format(crcvalue & 0xFFFFFFFF, '08x'))
The correct result should be "a509ae4b", but my code's result is "dedf5161". I think the checksum is being calculated on either the first or the last 64 KB of the file instead of on the whole file.
How should the code be altered so it checks the entire file without loading the entire file into memory?
As it is, the code "works" in either Python 2.x or 3.x. If the code has to be in one or the other, I'd prefer it to be in 3.x.
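For what it's worth, here is a minimal sketch of one possible fix, assuming the problem is that the loop above throws away the previous CRC on every pass, so only the last chunk ends up being counted. `zlib.crc32` accepts an optional second argument, a starting value, so the running checksum can be fed back in for each chunk. The function name `file_crc32` and its parameters are my own placeholders, not anything from a library.

```python
import zlib


def file_crc32(path, buffersize=65536):
    """Return the CRC-32 of a file as 8 hex digits, reading it in chunks.

    Hypothetical helper for illustration; the name and parameters are
    assumptions, not part of zlib.
    """
    crcvalue = 0  # starting value for a fresh CRC-32
    with open(path, 'rb') as afile:
        while True:
            buffr = afile.read(buffersize)
            if not buffr:
                break  # end of file reached
            # Pass the running CRC back in so every chunk contributes
            crcvalue = zlib.crc32(buffr, crcvalue)
    # The mask keeps the result unsigned on Python 2 as well as 3
    return format(crcvalue & 0xFFFFFFFF, '08x')


if __name__ == '__main__':
    import sys
    # Usage (assumed): python script.py Pleiades_Spitzer_big.jpg
    for name in sys.argv[1:]:
        print(file_crc32(name))
```

Only one chunk of at most `buffersize` bytes is held in memory at a time, and the result should not depend on the chunk size chosen.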