I'm trying out Windows overlapped I/O but I can't seem to get it to work asynchronously. I've compiled and run the program below, but it never prints anything; it just completes silently. I've read that small reads can complete synchronously, which is why I deliberately chose to read 512MB.

  const DWORD Size = 1<<29; // 512MB
  char* Buffer = (char*)malloc(Size);
  DWORD BytesRead;
  OVERLAPPED Overlapped;
  memset(&Overlapped, 0, sizeof(Overlapped));

  HANDLE File = CreateFile("BigFile", GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL|FILE_FLAG_OVERLAPPED, NULL);
  assert(File!=INVALID_HANDLE_VALUE);

  DWORD Result = ReadFileEx(File, Buffer, Size, &Overlapped, NULL); // This line takes 150ms according to the debugger
  assert(Result);

  while(!GetOverlappedResult(File, &Overlapped, &BytesRead, FALSE)) {
    printf("Waiting...\n");
  }

As additional information, I've stepped through the code in the debugger, and the Overlapped.InternalHigh value gets updated (to the same value as Size) during the ReadFileEx call.

I've tried replacing malloc with VirtualAlloc, ReadFileEx with ReadFile, adding FILE_FLAG_NO_BUFFERING, and checked that the return of ReadFile was 0 and that GetLastError would be ERROR_IO_PENDING after the read. I've tried using RAMMap to see if the file was in cache but only 96KB of it was there.

I'm running Windows 10 (ver. 1703) with 8GB RAM.

Lutopia
  • Maybe the file is cached by the OS, so the OS decides that it can easily service the read immediately? – EOF May 13 '17 at 18:24
  • @EOF Would it seriously cache a whole 512MB+ file? Plus if it's cached, my read still takes 150ms... then what's the point of overlapped if it cannot guarantee me quick return? – Lutopia May 13 '17 at 18:32
  • This is really cache-related. Open the file with `FILE_FLAG_OVERLAPPED|FILE_FLAG_NO_BUFFERING` and allocate `PVOID Buffer = VirtualAlloc(0, Size, MEM_COMMIT, PAGE_READWRITE)` (the buffer must be aligned for a non-cached read), then check the result. In your case it's also better to use `ReadFile` (without `Ex`): it returns FALSE in this case, with last error `ERROR_IO_PENDING` – RbMm May 13 '17 at 19:30
  • I'm pretty sure it will cache a 512MB+ file if enough RAM is available. You may try to clear cache before your test using SysInternals [RAMMap](https://technet.microsoft.com/en-us/sysinternals/rammap.aspx). – zett42 May 13 '17 at 19:30
  • Weird... 150ms is way too long for queueing up an asynchronous IO request and, yes, I agree that is what should happen:( – ThingyWotsit May 13 '17 at 19:35
  • https://blogs.msdn.microsoft.com/oldnewthing/20110923-00/?p=9563 – RbMm May 13 '17 at 20:25
  • @RbMm With `FILE_FLAG_NO_BUFFERING`, `VirtualAlloc` and `ReadFile`, it takes around 140ms and still reads everything right away, its result is 0 and `GetLastError()==ERROR_IO_PENDING` – Lutopia May 13 '17 at 20:28
  • Yes, this is exactly how it must be: *it takes around 140ms and still reads everything right away, its result is 0 and GetLastError()==ERROR_IO_PENDING* – RbMm May 13 '17 at 20:29
  • @IInspectable Windows 10 Version 1703 and 8GB of RAM – Lutopia May 13 '17 at 20:33
  • Hmm.. looks like I've been accidentally doing the best thing for decades - (offloading to a thread or pool and firing callbacks:), just because I already had the code and could not be bothered to change to async I/O for no great gain:) – ThingyWotsit May 13 '17 at 21:22
  • Add the line `if (Result) SleepEx(INFINITE, TRUE);` at the end of your code snippet (after *while(!GetOverlappedResult..*) and see what happens :) You get *Unhandled exception at 0*, because you pass *NULL* as *lpCompletionRoutine*, but I'm sure that this is a Windows bug – RbMm May 13 '17 at 22:21

1 Answer

Ok I've got it working, all thanks to @RbMm.

The memory allocation didn't change anything but the FILE_FLAG_NO_BUFFERING flag and use of ReadFile made it work. I tried using SleepEx after ReadFileEx and it threw an access violation at 0, which proves the point of @RbMm that the lpCompletionRoutine is not optional but mandatory. (To me that means I'm going to use ReadFile because I don't want a completion routine)

As to why it took me so long to realise what was happening: I trusted the debugger too much. Breaking into the debugger obviously didn't stop the I/O, so the OVERLAPPED structure was still being updated in memory, which made me think things were instantaneous. On top of that, I expected ReadFile to return quickly, but it actually takes 20ms when I request 512MB; it's far quicker when I request smaller amounts.
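
One way to act on that last observation would be to split a large read into several smaller overlapped requests, each with its own OVERLAPPED whose Offset and OffsetHigh fields carry the file position. The sketch below only does the offset arithmetic, so it is plain C with no Windows calls; `ChunkPlan` and `plan_chunks` are hypothetical names, and whether smaller requests actually help is something to measure on the target machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one overlapped read request. */
typedef struct {
    uint32_t OffsetLow;   /* value for OVERLAPPED.Offset     */
    uint32_t OffsetHigh;  /* value for OVERLAPPED.OffsetHigh */
    uint32_t Length;      /* bytes to request in this read   */
} ChunkPlan;

/* Splits the byte range [0, Total) into pieces of at most Chunk bytes.
   Fills in up to MaxOut entries and returns how many pieces are needed,
   which may be larger than MaxOut. */
static size_t plan_chunks(uint64_t Total, uint32_t Chunk, ChunkPlan* Out, size_t MaxOut)
{
    size_t Count = 0;
    uint64_t Pos = 0;
    if (Chunk == 0) return 0;
    while (Pos < Total) {
        uint64_t Left = Total - Pos;
        uint32_t Len = (uint32_t)(Left < Chunk ? Left : Chunk);
        if (Count < MaxOut) {
            Out[Count].OffsetLow  = (uint32_t)(Pos & 0xFFFFFFFFu); /* low 32 bits  */
            Out[Count].OffsetHigh = (uint32_t)(Pos >> 32);         /* high 32 bits */
            Out[Count].Length     = Len;
        }
        Pos += Len;
        ++Count;
    }
    return Count;
}
```

With FILE_FLAG_NO_BUFFERING the chunk size would also need to be a multiple of the sector size, so that every offset stays sector-aligned.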

Thanks everyone for your suggestions :)

For completeness, here's the working program:

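  const DWORD Size = 1<<20; // 1MB; must be a multiple of the sector size with FILE_FLAG_NO_BUFFERING
  // NOTE: unbuffered reads expect a sector-aligned buffer; malloc does not guarantee that, VirtualAlloc does
  char* Buffer = (char*)malloc(Size);
  DWORD BytesRead;
  OVERLAPPED Overlapped;
  memset(&Overlapped, 0, sizeof(Overlapped)); // zero offset: read from the start of the file

  HANDLE File = CreateFile("BigFile", GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_FLAG_NO_BUFFERING|FILE_FLAG_OVERLAPPED, NULL);
  assert(File!=INVALID_HANDLE_VALUE);
  DWORD Result = ReadFile(File, Buffer, Size, NULL, &Overlapped); // returns before the read has finished
  assert(Result==FALSE && GetLastError()==ERROR_IO_PENDING);
  // Poll until the read completes; stop on any error other than "still in progress"
  while(!GetOverlappedResult(File, &Overlapped, &BytesRead, FALSE)) {
    if(GetLastError()!=ERROR_IO_INCOMPLETE) break;
    printf("Waiting...\n");
  }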
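
A note on FILE_FLAG_NO_BUFFERING: the file buffering documentation asks that read sizes and file offsets be multiples of the volume sector size, and that the buffer address be sector-aligned. Here is a small sketch of those checks in plain C; `round_up_to_sector` and `is_sector_aligned` are hypothetical helper names, and both assume the sector size is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Rounds Size up to the next multiple of Sector (Sector must be a power of two). */
static uint64_t round_up_to_sector(uint64_t Size, uint32_t Sector)
{
    return (Size + Sector - 1) & ~((uint64_t)Sector - 1);
}

/* Returns 1 if Ptr sits on a Sector boundary (Sector must be a power of two). */
static int is_sector_aligned(const void* Ptr, uint32_t Sector)
{
    return ((uintptr_t)Ptr & ((uintptr_t)Sector - 1)) == 0;
}
```

The actual sector size has to be queried from the volume (GetDiskFreeSpace reports bytes per sector). VirtualAlloc returns page-aligned memory, which is why it is the usual suggestion for unbuffered reads, whereas a malloc pointer may or may not pass `is_sector_aligned`.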

Lutopia
  • You can also use `ReadFileEx`: it makes no difference to the final result or the operation time. But as I have just discovered, `lpCompletionRoutine` is a mandatory parameter, not an optional one. It must always point to a valid (possibly empty) routine; otherwise you get a crash if you wait in an *alertable* state after a successful call to `ReadFileEx`. So `ReadFileEx(File, Buffer, Size, &Overlapped, NULL);` is a mistake. To demonstrate, add `if (Result) SleepEx(INFINITE, TRUE);` at the end, after `GetOverlappedResult` has finished – RbMm May 13 '17 at 22:17