
I am trying to determine the approximate delay (on Windows 7, Vista, and XP) between an I/O operation completing and the waiting thread being switched back in.

What I (think I) know is that:

a) Thread context switches are themselves computationally very fast. (By very fast, I mean typically well under 1 ms, maybe even under 1 µs, assuming a relatively fast, unloaded machine, etc.)

b) Round-robin time-slice quanta are on the order of 10-15 ms.

What I can't seem to find is information about the typical latency between a (high-priority) thread becoming ready/signaled - via, say, a synchronous disk write completing - and that thread actually running again.

E.g., I have read in at least one place that all inactive threads remain asleep until the ~10 ms system quantum expires, and then (assuming they are ready to run) they all get reactivated almost simultaneously. But in another place I read that the delay between a thread's I/O operation completing and that thread becoming ready and running again is measured in microseconds, not milliseconds.

My context for asking: I am capturing from a high-speed camera and continuously streaming writes to a RAID array of SSDs. Unless I can start a new write well under 1 ms after the prior one finishes (ideally under 0.1 ms, on average), it will be problematic.

Any information regarding this issue would be most appreciated.

Thanks, David

1 Answer


Thread context switches cost between 2,000 and 10,000 CPU cycles, so a handful of microseconds.

An I/O completion is fast when a thread is blocking on the synchronization handle that signals completion. That makes the Windows thread scheduler temporarily boost the thread priority. Which in turn makes it likely (but not guaranteed) to be chosen as the thread that gets the processor love. So that's typically microseconds, not milliseconds.

Do note that disk writes normally go through the file system cache. Which makes the WriteFile() call a simple memory-to-memory copy that doesn't block the thread. This runs at memory bus speeds, 5 gigabytes per second and up. Data is then written to the disk in a lazy fashion, the thread isn't otherwise involved or delayed by that. You'll only get slow writes when the file system cache is filled to capacity and you don't use overlapped I/O. Which is certainly a possibility if you write video streams. The amount of RAM makes a great deal of difference. And SSD controllers are not made the same. Nothing you can reason out up front, you'll have to test.

Hans Passant
  • Thanks Hans, You write: I/O completion fast - usecs. You may have answered my question, but to confirm: 1) Currently, async disk write initiated in infinite "while" loop wt. Overlapped IO, & polled for completion next time through loop. Too slow at 1000fps... (FYI: Streaming video fills system RAM & SSD RAID card RAM immediately.) 2) Was thinking of FIFO buf, pushed from main thread, popped & written (sync?) in 2nd thread. 3) Question: When the 2nd thread write completes, will that thread "instantly"(us) grab control to initiate another write? (Can make 2nd thread high priority.) Thx, D – TechnoFrolics Dec 07 '12 at 22:24
  • A thousand video frames per second? That's a serious fire hose. At least set a realistic goal by measuring the bytes/sec you can write to the SSD and divide by the avg bytes in a frame. That's your upper limit on frame rate. As long as you **never** exceed it you'll have a shot at buffering help you reach that limit. Go faster and no amount of buffering will do any good. – Hans Passant Dec 07 '12 at 22:44
  • Come to think of it, you are very seriously abusing that SSD drive. They are good at reading, excellent at seeking, they don't like writes. You'll kill that puppy in a month or less. – Hans Passant Dec 07 '12 at 22:59
  • All good thoughts. I have already confirmed that raw bandwidth about twice what I need (well over 1 GigaByte/second sustained). Re SSD lifetime - yes, you are quite correct that if left running at that rate, MLC drives have limited life. (Helps to have large size drives so cycles/cell minimized. And currently in 8 drive RAID array, so... :-) Does your response mean that in your estimation thread-switch response to write-complete event (in high priority and/or boosted priority thread) indeed likely in microsecond range? Thanks much, D. – TechnoFrolics Dec 07 '12 at 23:38
  • PS: And it sounds from your comment of "An I/O completion is fast when a thread is blocking on the synchronization handle that signals completion." that I should make secondary thread write indeed be synchronous and blocking. (This is easier than some alternatives anyway, and seems to make more sense than a callback etc.) D – TechnoFrolics Dec 08 '12 at 20:56
  • Hans, A couple of final comments for now (and I have selected your answer above): 1) I have not yet implemented a 2nd thread yet (been implementing FIFO in main thread first to allow easier debugging, and/or may be able to push at more than 1 place in main thread while loop). But on thinking about your caching comment further: Given the OS is caching, my concern about latency between pushes (assuming OS write cache fills at application start, which I think it does), on the face of it, makes no sense - that would only make sense if there were no cache (and I certainly agree there is). – TechnoFrolics Dec 11 '12 at 13:08
  • So wondering if problem is "lazy" aspect of OS's cache flush. I.e., when cache not 100% full, OS writes only say every few ms and SSDs idle in between, whereas if 100% full, OS writes packets more quickly and SSD kept near 100% busy, and thus goal is to keep cache full. 2) A while back I mentioned to colleague limited life of SSDs (as you noted). His response: With high speed camera capture, SSDs considered "disposable media" :-). (Having read-only "failure" mode.) If I learn anything further worth posting, I will. Thanks, David – TechnoFrolics Dec 11 '12 at 13:09