I'm trying to figure out a way to improve a C++ Win32 program I've made which basically recursively traverses a given folder and, for every file found, computes a hash (let's say MD5, but it could be any sort of CPU-expensive computation). Since this is an I/O-bound application, most of the time the process is waiting for I/O to finish and is therefore not using as much CPU as it could. Even doing this with a thread pool would probably (am I wrong?) not solve the issue: every thread would still block waiting for its I/O to complete, plus there would be the context-switching overhead.
So I'm starting to consider doing this using overlapped reads: every time I collect a new file to process, I would enqueue a non-blocking read operation, and have one thread processing the completion callbacks and hashing every chunk it receives from the queue ... theoretically this should avoid the process hanging on I/O waits, and I should see CPU usage increase, thus an overall speedup.
I have the following questions:
- I am assuming this will increase the overall performance of the application, am I right? If not, why?
- Are I/O completion events guaranteed to arrive in the same order the read operations were issued? I mean, if I read N bytes from offsets A, B and C of a file, will I get the completion events for A, B and C in that order, or could they arrive in an unpredictable order? (See the small sketch below the questions for how I picture issuing the reads.)
- I'm searching for a library or some code samples to implement this whole mechanism. Should I use an IOCP, or simply RegisterWaitForSingleObject with custom callbacks? I can't seem to find examples for multiple-file I/O; everything I find is either an example of overlapped reads on a single file, or an IOCP with sockets. Can you point me in the right direction?
- Wouldn't a thread pool be useless in this case? A single-threaded approach should be good enough (following the nginx/libevent approach, for instance), right?
Please do not answer with alternative solutions; I just want to implement an OVERLAPPED operations queue the best way I can, and I'm not interested in anything else (unless proven to be more efficient for my scenario, of course).
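To make the ordering question concrete, this is roughly how I imagine issuing several outstanding reads on the same file (just a rough, untested sketch with made-up names and no error handling; the handle is assumed to be opened with FILE_FLAG_OVERLAPPED):

#include <windows.h>

// Each pending read carries its own OVERLAPPED with the offset it was issued
// at, so even if completions arrive out of order I can tell which chunk is which.
struct ReadRequest
{
    OVERLAPPED ov;                  // Offset / OffsetHigh = where this read starts
    BYTE       buffer[64 * 1024];
};

void issue_read_at( HANDLE hFile, ReadRequest *req, ULONGLONG offset )
{
    ZeroMemory( &req->ov, sizeof( req->ov ) );
    req->ov.Offset     = (DWORD)( offset & 0xFFFFFFFF );
    req->ov.OffsetHigh = (DWORD)( offset >> 32 );

    if( !ReadFile( hFile, req->buffer, sizeof( req->buffer ), NULL, &req->ov ) &&
        GetLastError() != ERROR_IO_PENDING )
    {
        // Real failure ( EOF shows up as ERROR_HANDLE_EOF ), not just "pending".
    }
}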
EDIT:
This is roughly what the current implementation of the software looks like (of course the app is not exactly like this, it's just to give an idea):
#include <windows.h>
#include <stdio.h>

DWORD crc32( PBYTE data, DWORD size )
{
    // Compute the crc32 of the data and return it.
}

void on_file_callback( const char *pszFileName )
{
    DWORD file_size = ...; // Size of the file.
    PBYTE file_map  = ...; // Open the file and memory-map it.

    if( crc32( file_map, file_size ) == 0xDEADBEEF )
    {
        printf( "OMG!!!\n" );
    }

    // Cleanup ( unmap the view, close the handles, etc. ).
}

int main( int argc, char **argv )
{
    const char *pszFolder = "c:\\";

    // Recurse pszFolder and call 'on_file_callback' on every file found.
    recurse_directory( pszFolder, on_file_callback );

    return 0;
}
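And this is roughly how I picture restructuring it around an IOCP with a single consumer thread. Again just an untested sketch under my own assumptions: one outstanding read per file, hash_update / hash_finalize are placeholders for whatever per-file MD5/crc32 state I'd keep, and the directory recursion would call queue_file instead of on_file_callback:

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_SIZE ( 64 * 1024 )

// Per-file context; the OVERLAPPED is the first member so the pointer returned
// by GetQueuedCompletionStatus can be cast straight back to the context.
struct FileContext
{
    OVERLAPPED  ov;
    HANDLE      hFile;
    ULONGLONG   offset;
    BYTE        buffer[CHUNK_SIZE];
    // ... per-file hash state ( MD5 / crc32 context ) would live here too.
};

void hash_update( FileContext *ctx, DWORD bytes ) { /* fold the chunk into the hash */ }
void hash_finalize( FileContext *ctx )            { /* compare against 0xDEADBEEF, etc. */ }

BOOL issue_next_read( FileContext *ctx )
{
    ctx->ov.Offset     = (DWORD)( ctx->offset & 0xFFFFFFFF );
    ctx->ov.OffsetHigh = (DWORD)( ctx->offset >> 32 );

    // TRUE means the read completed or is pending; either way a packet reaches the port.
    return ReadFile( ctx->hFile, ctx->buffer, CHUNK_SIZE, NULL, &ctx->ov ) ||
           GetLastError() == ERROR_IO_PENDING;
}

// Called by the directory recursion for every file found: open it overlapped,
// associate it with the completion port and kick off the first read.
void queue_file( HANDLE hPort, const char *pszFileName )
{
    HANDLE hFile = CreateFileA( pszFileName, GENERIC_READ, FILE_SHARE_READ, NULL,
                                OPEN_EXISTING,
                                FILE_FLAG_OVERLAPPED | FILE_FLAG_SEQUENTIAL_SCAN, NULL );
    if( hFile == INVALID_HANDLE_VALUE )
        return;

    FileContext *ctx = (FileContext *)calloc( 1, sizeof( FileContext ) );
    ctx->hFile = hFile;

    CreateIoCompletionPort( hFile, hPort, 0, 0 ); // the context travels in the OVERLAPPED

    if( !issue_next_read( ctx ) ) // e.g. empty file: synchronous EOF
    {
        CloseHandle( hFile );
        free( ctx );
    }
}

// Single consumer thread: wait for any completed chunk ( from any file ),
// hash it, then request the next chunk of that same file.
void completion_loop( HANDLE hPort )
{
    for( ;; )
    {
        DWORD       bytes = 0;
        ULONG_PTR   key   = 0;
        OVERLAPPED *pov   = NULL;

        BOOL ok = GetQueuedCompletionStatus( hPort, &bytes, &key, &pov, INFINITE );
        if( !pov )
            break; // the port itself failed or was closed

        FileContext *ctx = (FileContext *)pov;

        if( !ok || bytes == 0 ) // EOF or a read error: finish this file
        {
            hash_finalize( ctx );
            CloseHandle( ctx->hFile );
            free( ctx );
            continue;
        }

        hash_update( ctx, bytes );
        ctx->offset += bytes;

        if( !issue_next_read( ctx ) ) // synchronous EOF or failure
        {
            hash_finalize( ctx );
            CloseHandle( ctx->hFile );
            free( ctx );
        }
    }
}

The port itself would be created once with CreateIoCompletionPort( INVALID_HANDLE_VALUE, NULL, 0, 1 ), the recursion would call queue_file for every file found, and completion_loop would run on the main thread (or a single worker), which is the single-threaded, nginx/libevent-style setup I mentioned above.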
Thanks.