I am trying to split a file into chunks and create SHA1 fingerprints for those chunks. My file contains just one line, "tests", but when I generate the fingerprint from the Python command prompt I get a different answer than my code produces. The prompt session comes first, followed by the code.

>>> m = hashlib.sha1()
>>> m.update("tests")
>>> m.digest()
'\x04\xd1?\xd0\xaao\x01\x97\xcf,\x99\x90\x19\xa6\x07\xc3l\x81\xeb\x9f'
>>> 
>>> m.hexdigest()
'04d13fd0aa6f0197cf2c999019a607c36c81eb9f'
-------------------------------------------------------------   

    import sys, os,hashlib
    kilobytes = 1024
    megabytes = kilobytes * 1000
    chunksize = int(1.4 * megabytes)                   # default: roughly a floppy
    hash = hashlib.sha1()
    def split(fromfile, todir, chunksize=chunksize):
        if not os.path.exists(todir):                  # caller handles errors
            os.mkdir(todir)                            # make dir, read/write parts
        else:
            print " path exists"
          ##for fname in os.listdir(todir):            # delete any existing files
          ## os.remove(os.path.join(todir, fname))
        partnum = 0
        input = open(fromfile, 'rb')                   # use binary mode on Windows
        while 1:                                       # eof=empty string from read
            chunk = input.read(chunksize)              # get next part <= chunksize
            print "chunk=",chunk,"\n"
            if not chunk: break
            partnum  = partnum+1
            filename = os.path.join(todir, ('hashpart%04d' % partnum))
            fileobj  = open(filename, 'wb')
            print "chunk before hashin is =",chunk, "of type ",type(chunk)
            hash.update(chunk)
            print "finger print is ",hash.hexdigest()
            fileobj.write(hash.digest())
            fileobj.close()
        input.close(  )
        assert partnum <= 9999                         # join sort fails if 5 digits
        return partnum

-----------------------------------------------------------

But the code above gives me:

chunk before hashin is = tests
of type  <type 'str'>
finger print is  64853233b4bd86fc53565e1383f2b19b6ede2995

Can someone help me?

1 Answer

Your file contains "tests\n", not "tests". That's why the output of your print statement spanned multiple lines. Add the newline and you get the same hash as your script:

>>> m.update("tests\n")
>>> m.hexdigest()
'64853233b4bd86fc53565e1383f2b19b6ede2995'
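
To check this end to end, here is a minimal sketch, assuming Python 3 (where `hashlib` works on `bytes`, unlike the Python 2 session above). The helper name `chunk_fingerprints` and the temporary file are made up for illustration and are not from the question. It prints each chunk with `repr()` so a trailing newline is visible, and it creates a fresh SHA1 object per chunk. That second part is my own choice: a single shared hash object, as in the question, keeps accumulating data across `update()` calls, so later digests would cover all chunks read so far.

```python
import hashlib
import os
import tempfile


def chunk_fingerprints(path, chunksize=1024):
    """Yield (chunk, sha1_hex) for each chunk of the file at `path`.

    Hypothetical helper for illustration, not part of the original post.
    """
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunksize)
            if not chunk:  # an empty read means end of file
                break
            # Fresh hash per chunk, so each digest covers only that chunk.
            yield chunk, hashlib.sha1(chunk).hexdigest()


# Recreate the situation from the question: a one-line file that ends
# with a newline, as most editors save it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"tests\n")
    path = tmp.name

try:
    results = list(chunk_fingerprints(path))
finally:
    os.remove(path)

for chunk, digest in results:
    # repr() shows the trailing newline explicitly as b'tests\n'.
    print(repr(chunk), digest)

# Hash of the text without the newline, for comparison.
print(hashlib.sha1(b"tests").hexdigest())
```

If the two printed digests differ, the file content and the string typed at the prompt are not the same bytes. Whether to strip the newline before hashing depends on what the fingerprint is for; stripping it means the fingerprint no longer describes the file exactly as stored.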
tdelaney