I have a program that currently hashes files using SHA1 only, via the SHA1 functions that ship with Lazarus and the Free Pascal Compiler.
I've since added the ability to use MD5, SHA256 and SHA512 through the DCPCrypt library (http://wiki.lazarus.freepascal.org/DCPcrypt or http://www.cityinthesky.co.uk/opensource). Everything works, but my earlier version hashed the file with a 2 MB buffer if the file was larger than 1 MB. Otherwise it used the default buffer of 1024 bytes, like this:
if SizeOfFile > 1048576 then // if > 1 MB
  begin
    fileHashValue := SHA1Print(SHA1File(NameOfFileToHash, 2097152)); // 2 MB buffer
  end
else
  fileHashValue := SHA1Print(SHA1File(NameOfFileToHash)); // default 1024-byte buffer
However, I have now moved my hashing functions and procedures into a single function, controlled by the state of a set of radio buttons, to make my code more object-oriented. It has all four hashing options coded within it, and which section runs depends on which RadioButton.Checked the program finds set. The SHA1 branch, for example, now looks like this:
..
SourceData := TFileStream.Create(FileToBeHashed, fmOpenRead);
..
else if SHA1RadioButton2.Checked = true then
  begin
    varSHA1Hash := TDCP_SHA1.Create(nil);
    varSHA1Hash.Init;
    varSHA1Hash.UpdateStream(SourceData, SourceData.Size); // HOW DO I ADD A BUFFER HERE?
    varSHA1Hash.Final(DigestSHA1);
    varSHA1Hash.Free;
    for i := 0 to 19 do // 40 character output
      GeneratedHash := GeneratedHash + IntToHex(DigestSHA1[i],2);
  end // End of SHA1 if
My question is: how do I make varSHA1Hash.UpdateStream use a larger buffer when the file is 'large' (say, bigger than 1 MB)? This matters because a 300 MB file, for example, takes 4 seconds with my earlier version but 9 seconds with my 'improved' version that uses the DCPCrypt library. So large files now take more than twice as long, even though my code reads much better. UpdateStream reads the stream through a fixed 8 KB buffer (as the library source shows), which for a 300 MB file works out to about 38,400 separate reads, so I am hoping that reading several MB at a time will make it faster. As I understand it, the second argument in varSHA1Hash.UpdateStream(SourceData, SourceData.Size); is the total number of bytes to hash rather than the buffer size. Is that right?
If it helps, here is the UpdateStream procedure from
procedure TDCP_hash.UpdateStream(Stream: TStream; Size: longword);
var
  Buffer: array[0..8191] of byte;
  i, read: integer;
begin
  dcpFillChar(Buffer, SizeOf(Buffer), 0);
  for i:= 1 to (Size div Sizeof(Buffer)) do
  begin
    read:= Stream.Read(Buffer,Sizeof(Buffer));
    Update(Buffer,read);
  end;
  if (Size mod Sizeof(Buffer))<> 0 then
  begin
    read:= Stream.Read(Buffer,Size mod Sizeof(Buffer));
    Update(Buffer,read);
  end;
end;
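Since UpdateStream above is just a read loop around Update, one thing I could try is writing that loop myself with a bigger buffer. A minimal sketch of what I mean follows, assuming Update can be called repeatedly between Init and Final exactly as UpdateStream does. The names Sha1OfFileChunked and ChunkSize are my own placeholders, not part of DCPCrypt, and I have not measured whether this actually closes the speed gap.

```pascal
program ChunkedSha1Sketch;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DCPsha1;

const
  // Placeholder chunk size: 2 MB, matching the buffer my old SHA1File call used.
  ChunkSize = 2 * 1024 * 1024;

// Placeholder helper: hashes a file by feeding ChunkSize pieces to Update.
function Sha1OfFileChunked(const FileName: string): string;
var
  Source: TFileStream;
  Hasher: TDCP_sha1;
  Buffer: array of Byte;          // dynamic array, so the 2 MB lives on the heap
  Digest: array[0..19] of Byte;   // SHA1 digest is 20 bytes
  BytesRead, i: Integer;
begin
  Result := '';
  SetLength(Buffer, ChunkSize);
  Source := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    Hasher := TDCP_sha1.Create(nil);
    try
      Hasher.Init;
      repeat
        // Read returns the number of bytes actually read; 0 means end of file.
        BytesRead := Source.Read(Buffer[0], ChunkSize);
        if BytesRead > 0 then
          Hasher.Update(Buffer[0], BytesRead);
      until BytesRead = 0;
      Hasher.Final(Digest);
    finally
      Hasher.Free;
    end;
  finally
    Source.Free;
  end;
  // 20 bytes become 40 hex characters.
  for i := 0 to High(Digest) do
    Result := Result + IntToHex(Digest[i], 2);
end;

begin
  if ParamCount >= 1 then
    WriteLn(Sha1OfFileChunked(ParamStr(1)));
end.
```

Looping until Read returns 0, rather than counting Size div buffer-size iterations, also avoids relying on the longword Size parameter. The same loop should work for the MD5, SHA256 and SHA512 classes if only the hasher type and digest length change.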
I have also looked at some other libraries, such as the Delphi Encryption Compendium (http://home.netsurf.de/wolfgang.ehrhardt/crchash_en.html), Wolfgang Ehrhardt's library (http://www.torry.net/pages.php?id=519#939342) and the one included with DoubleCommander, but for various reasons (simplicity being one) I am trying to do this using DCPCrypt.