Is it possible to force an asynchronous read operation initiated by FileStream.BeginRead to finish early, without errors?

Question

First of all, the documentation for EndRead does NOT explicitly say that an asynchronous read operation initiated by BeginRead is atomic or uninterruptible.

The Question

Is it possible to interrupt an asynchronous read operation started by FileStream.BeginRead, so that it finishes before filling the buffer, returning the number of bytes read into the buffer so far, or is it an all or nothing operation?

In other words, is there some method like "Cancel_IO" that I can call, such that when I call EndRead, instead of waiting for all possible bytes to be read, it returns earlier as a result of the read being cancelled?

Background

I've read the documentation of FileStream, BeginRead, and EndRead. EndRead does not have any overloads that are capable of triggering premature completion of the operation, returning a partially full buffer. I'm interested in whether anyone can confirm or deny the existence of a method in the Windows Operating System's API (Win32), or perhaps of a disk driver API, that could cause an operation initiated by FileStream.BeginRead to finish early when EndRead is called. By "early", I mean before filling the entire requested buffer length, without an error.

Use Case

For the sake of the unimaginative, assume the file is on a network share, and the network may sometimes experience extreme slow-downs, such that triggering the early completion of a generic 1MB buffering operation would be practical and optimal, in order to retrieve a few bytes for processing before resuming a new 1MB buffering operation.

Those "few bytes" could be used to initiate the construction of a number of computationally intensive in-memory resources, which could be constructed while the buffering is allowed to finish.

About the Documentation

Note that the documentation of BeginRead does not explicitly state that the asynchronous operation is atomic or uninterruptible. All it mentions is that if an "error" occurs, you won't know about it until EndRead is called. This does not preclude the possibility that some other event, which is not an error, could occur that would cause EndRead to return fewer bytes than the number requested, which it does all the time anyway.

For example, "end of file" and "buffer full" can be thought of as the two "natural" interruptions of an asynchronous read operation, which cause it to return fewer than the number of bytes requested, without error. I'm looking for "artificial" interruption possibilities, which would also cause EndRead to successfully return the number of bytes read into the buffer, before the EOF and before the buffer is full.

How many bytes are you reading at a time that it would be worthwhile to stop before they are all read? Or are you reading from a remote filesystem? — John Saunders, Aug 29 '11 at 02:40
The question is theoretical. Assume it is worthwhile to interrupt the read operation. The number of bytes I'm reading is irrelevant; it's the immediate need for any data that may have been read so far that is important. For the sake of argument, assume a 1MB read operation is active, but we could REALLY use the first 8 bytes to go do something else that takes some time, which would be useful to do while the read is taking place, rather than wait on it to finish filling the buffer completely. — Triynko, Aug 29 '11 at 03:03
I understand that there are multiple layers of buffering, both in the .NET framework and the file system. I also know how to open a file handle for a FileStream to bypass these buffering layers and read data directly into a byte array. I'm strictly concerned with whether I can interrupt a BeginRead operation and return the number of bytes read so far, or if it's an atomic operation as far as the framework and/or underlying file system is concerned. — Triynko, Aug 29 '11 at 03:09
The reason the number of bytes I'm reading is irrelevant, theoretically, is because whether I'm reading a kilobyte or a gigabyte, the source of the read could slow unexpectedly (depending on the underlying hardware and OS), turning a kilobyte read into an hour-long operation. The point is... SOME read operation is taking place, which hasn't completed, and may or may not complete anytime soon, AND we'd like to get anything read so far NOW. — Triynko, Aug 29 '11 at 03:16
Would you like to get _anything_? What about a single byte? Surely, what you want to do is get _eight_ bytes, then go do something with it. — John Saunders, Aug 29 '11 at 03:34
Yes, anything. One byte would be the minimum possible, obviously, since that's the smallest addressable chunk in Windows. I may want more than one byte, it would depend on the data stream. An asynchronous read may have been issued, just before processing of read data started. Partway through processing, it is determined that X additional bytes >= 1 would be more useful immediately rather than whenever BUFFER_SIZE bytes have been read by the pending asynchronous read operation. That is the scenario. — Triynko, Aug 29 '11 at 04:16

score 2 · Answer 1 · answered Aug 29 '11 at 03:41

The documentation explicitly says that EndRead must be called with this IAsyncResult to find out how many bytes were read. On the other hand, EndRead blocks the calling thread until the read operation is completed. So it seems like the read operation is atomic.

This is logical to me, since your scenario have a little of practical usage. If valuable information is stored in part of file being read, then you can always read it in smaller portions.
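To illustrate the "smaller portions" suggestion, here is a minimal sketch. It is written in Python rather than against the .NET FileStream API, and the names `read_in_chunks` and `chunk_size` are hypothetical, chosen only for illustration. Each small read returns on its own, so the caller can act on the first few bytes before the rest has arrived.

```python
import io
from typing import BinaryIO, Iterator


def read_in_chunks(stream: BinaryIO, total: int, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield up to `total` bytes from `stream` in pieces of at most `chunk_size` bytes."""
    remaining = total
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:  # end of stream reached before `total` bytes were read
            break
        remaining -= len(chunk)
        yield chunk


# Usage: an in-memory stream stands in for the file (an assumption for the demo).
source = io.BytesIO(bytes(range(20)))
chunks = list(read_in_chunks(source, total=16, chunk_size=8))
```

The trade-off is more read calls per buffer, and it only helps if the chunk size is chosen before the read is issued.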

Petr Abdulin

Thanks, but I already read the documentation. I know EndRead is blocking. I gathered that logically the operation is atomic. That's why I asked the question to make sure. But all you've done here is make unimaginative assumptions about the existence of practical applications, as well as assumptions about the non-existence of methods on the FileStream class or within the operating system's APIs that could cause an immediate subsequent call to EndRead to return prematurely and successfully without an error. – Triynko Aug 29 '11 at 04:00
And yes, I could read in smaller portions, but that would require me to know in advance whether that's necessary, which is not necessarily the case depending on the contents of the data read so far which is currently being processed. By the time processing would get to a point where it would know it needs only 8 bytes, it would have already initiated a buffering operation. What I'm interested in, is whether that buffering operation can be interrupted without error, by calling some other method in the framework or the OS API. – Triynko Aug 29 '11 at 04:23
@Triynko If you have read the documentation and are aware of the blocking nature of EndRead, then this should be specifically noted in your question. – Petr Abdulin Aug 29 '11 at 04:25
The reason this is necessary, is because the buffering component is a generic component that is passed a file name, and attempts to fill a buffer in the background so that it's available when GetBuffer is called. When GetBuffer is called, the buffer is returned, and the component attempts to fill the next buffer. There are two buffers in a swap chain. The issue is, suppose after calling GetBuffer and initiating the next read request, processing of the retrieved buffer finishes, and determines that calling a method like GetPartialBuffer or PeekBuffer would be more efficient. – Triynko Aug 29 '11 at 04:27
@ Petr: Whether I've read the documentation of (Begin/End)Read is irrelevant in this scenario, because the documentation does not answer my question, as I've explained. After quoting documentation, the answer begins "So, it seems like...", which is speculation. It is then followed with "logical to me", which is opinionated, followed by "your scenario have a little of practical usage", which is very presumptuous and unimaginative, since I already have a practical scenario in mind. Finally, saying "you can always read it in smaller portions" is false, because the need is not foreknown. – Triynko Aug 29 '11 at 04:36
@Triynko It is relevant to the question, since the question is open-ended ("Is it possible... or..."), so this information could be valuable to an enquirer. It's entirely possible that my answer is neither relevant nor useful to you. – Petr Abdulin Aug 29 '11 at 04:45
@Triynko Most of your comments should be included in the question; it would be much clearer then. – Petr Abdulin Aug 29 '11 at 04:47
@ Petr: Cool. Done. See edited question for relevant updates. – Triynko Aug 29 '11 at 04:50

Triynko · Answer 2 · 2011-08-29T06:15:40.760

I read something in the documentation for Windows synchronous and asynchronous I/O that may do the trick, but it would be a trick with uncertain consequences.

"If the handle is deallocated prematurely, ReadFile or WriteFile may incorrectly report that the I/O operation is complete."

Since the .NET BeginRead method is ultimately based on the Win32 ReadFile function, acquiring and prematurely deallocating the handle may accomplish what I'm trying to do. The consequences of this will need to be researched.

It also mentions "To cancel all pending asynchronous I/O operations, use either: CancelIo or CancelIoEx", but those appear to cancel entire operations with failure (ERROR_OPERATION_ABORTED). I'm not sure whether any bytes read would have already been written to the buffer, and even if they were, one would not know how many were successfully written. I wonder if there's a way to trick the underlying system into thinking it has suddenly reached the end of the file or stream...

I also see that the "SetCommTimeouts" method has some interesting results, in particular surrounding the "COMMTIMEOUTS Structure's" "ReadIntervalTimeout" member, which claims that:

"If the interval between the arrival of any two bytes exceeds this amount, the ReadFile operation is completed and any buffered data is returned."

That seems promising...

In any case, the mere fact that I can cancel pending asynchronous I/O is useful. I could actually compute, using buffering stats and start times, whether it would be worth it to cancel the asynchronous read and re-issue a smaller read of the desired data chunk, or whether it would be better to just wait for the operation to complete. It would depend on the calculated average speed of the stream, how close it is to completion (based on its computed/predicted progress value), and the expected completion time, weighed against the relative utility of obtaining the desired data chunk.
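The decision described above could be sketched roughly as follows. This is only one possible weighting, written in Python for brevity; the function name `should_cancel`, its parameters, and the `utility` factor are all assumptions made for illustration, not part of any framework or OS API. Progress is predicted from the average speed, since the real byte count of a pending read is not observable.

```python
def should_cancel(avg_speed: float, elapsed: float, total_bytes: int,
                  wanted_bytes: int, utility: float = 1.0) -> bool:
    """Return True if cancelling and re-issuing a small read looks worthwhile.

    avg_speed    -- average stream speed in bytes/second, from past buffering stats
    elapsed      -- seconds since the pending read was started
    total_bytes  -- size of the pending read
    wanted_bytes -- size of the small chunk needed right now
    utility      -- assumed weight: value of a second saved relative to a second wasted
    """
    if avg_speed <= 0:
        raise ValueError("avg_speed must be positive")
    predicted_done = min(total_bytes, avg_speed * elapsed)   # predicted progress
    wait_time = (total_bytes - predicted_done) / avg_speed   # time left if we just wait
    small_read_time = wanted_bytes / avg_speed               # time for the re-issued read
    wasted_time = predicted_done / avg_speed                 # work discarded by cancelling
    return utility * (wait_time - small_read_time) > wasted_time


# A 1 MB read at 1000 bytes/s: early on, cancelling wins; near completion, waiting wins.
early = should_cancel(1000.0, 1.0, 1_000_000, 8)
late = should_cancel(1000.0, 990.0, 1_000_000, 8)
```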

score 0 · Answer 3 · answered Sep 02 '11 at 14:09

No; there is no API that can do this.

Closing the handle is an old trick - it works way back to the NT days. However, it causes all outstanding operations on the handle to complete with an error.

Cancelling has similar problems - and note that CancelIoEx may not be available on your platform.

SetCommTimeouts is not an option, since it only works on serial ports and other communication device handles.

Cancellation has historically been one of the most difficult parts of writing a device driver. These days, the kernel has Cancel-Safe Queue support built-in to XP (back-portable to 2K), and it's a lot easier. But a lot of drivers (especially older drivers) just ignore cancellation anyway (which is legal).

I recommend implementing a "cancel" at a higher level of abstraction: close the handle or allow the operation to complete, and ignore the result.
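A minimal sketch of the "allow the operation to complete, and ignore the result" option, in Python rather than .NET. The class name `AbandonableRead` and its methods are hypothetical; the underlying read is never interrupted, the caller simply marks its result as unwanted.

```python
import io
import threading
from concurrent.futures import ThreadPoolExecutor


class AbandonableRead:
    """Wraps a background read whose result can be discarded after the fact."""

    def __init__(self, executor: ThreadPoolExecutor, stream, count: int):
        self._abandoned = threading.Event()
        # The read runs to completion regardless; nothing here cancels the I/O.
        self._future = executor.submit(stream.read, count)

    def abandon(self) -> None:
        """Mark the result as unwanted; the read itself keeps running."""
        self._abandoned.set()

    def result(self):
        """Wait for the read to finish; return its data, or None if abandoned."""
        data = self._future.result()
        return None if self._abandoned.is_set() else data


# Usage: in-memory streams stand in for slow files (an assumption for the demo).
with ThreadPoolExecutor(max_workers=1) as pool:
    kept = AbandonableRead(pool, io.BytesIO(b"abcdefgh"), 4)
    dropped = AbandonableRead(pool, io.BytesIO(b"abcdefgh"), 4)
    dropped.abandon()
    kept_data = kept.result()
    dropped_data = dropped.result()
```

This does not return a partial buffer; it only lets the caller stop caring about a read that can no longer be made useful.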
