Busy Loop/Spinning sometimes takes too long under Windows

Question

I'm using a Windows 7 PC to output voltages at a rate of 1 kHz. At first I simply ended the thread with sleep_until(nextStartTime), however this has proven to be unreliable, sometimes working fine and sometimes being off by up to 10 ms.

I found other answers here saying that a busy loop might be more accurate, however mine for some reason also sometimes takes too long.

while (true) {
        doStuff();  //is quick enough
        logDelays();

        nextStartTime = chrono::high_resolution_clock::now() + chrono::milliseconds(1);

        spinStart = chrono::high_resolution_clock::now();

        while (chrono::duration_cast<chrono::microseconds>(nextStartTime - 
                         chrono::high_resolution_clock::now()).count() > 200) {
            spinCount++; //a volatile int
        }
        int spintime = chrono::duration_cast<chrono::microseconds>
                              (chrono::high_resolution_clock::now() - spinStart).count();

        cout << "Spin Time micros :" << spintime << endl;

        if (spinCount > 100000000) {
            cout << "reset spincount" << endl;
            spinCount = 0;
        }

}

I was hoping that this would work to fix my issue, however it produces the output:

  Spin Time micros :9999
  Spin Time micros :9999
  ...

I've been stuck on this problem for the last 5 hours and I'd be very thankful if somebody knows a solution.

What are you actually trying to do? Sleep for an exact amount of time? — nwp, Sep 04 '17 at 14:36
What sort of hardware are you using to output? Precise timing in user space on operating systems like Windows is generally pretty much impossible, because your thread might not be scheduled at the exact moment you want it to run, even if the CPU is mostly idle. On a busier system, a spin loop makes that even more likely by burning up your resource allocation. In some previous testing I did, even a thread with "realtime" priority in a spin loop on Windows 10 seemed to get interrupted (I didn't investigate, but could see it getting switched to other CPU cores) — Fire Lancer, Sep 04 '17 at 14:42
@nwp Yes, my goal is to let the function doStuff() run every millisecond, as it sends the voltage data to a data acquisition card — Mefaso, Sep 04 '17 at 14:47
@FireLancer I'm using a Data-Acquisition card from NI, however since the data depends on real-time input into the pc the computation can't be moved away from that pc. The computer is running with 2-3% cpu usage. — Mefaso, Sep 04 '17 at 14:50
I don't understand what you are doing with `spincount` and why it is `volatile`. Your process can get interrupted at any time by Windows for various reasons, so you can't guarantee exact timing on a non-realtime OS. Just use something like `auto delay = std::chrono::milliseconds(1); auto target_time = now(); while (now() < target_time + delay){} target_time += delay;` and be done with it. It will not get more exact than that. — nwp, Sep 04 '17 at 14:56
@Mefaso, so you are reading as well as writing the voltages at 1 kHz? Do they have any API docs? I couldn't see any at a quick glance. What I'd want to see is whether they can read and write samples to buffers, like you might with audio (e.g. even games and other software generating audio on the fly send off thousands of samples at a time at 48 kHz and let the driver/hardware deal with the precise timing, and the same goes for microphone and other input) — Fire Lancer, Sep 04 '17 at 15:04
@FireLancer Right, they can be sent in buffers, however I'm reading data from the cursor position, which is not directly available to the DAQ. So I'm sending these buffers once a millisecond to be able to produce a refresh rate from the cursor to the voltage output at 1 kHz. — Mefaso, Sep 04 '17 at 15:15
@nwp I will try that, thank you. I saw it in another post about busy loops and thought it shouldn't hurt and maybe prevent optimizations ruining my loop. — Mefaso, Sep 04 '17 at 15:16
@nwp Sadly that produces the exact same result. Thank you anyways. — Mefaso, Sep 04 '17 at 15:29
[Cannot reproduce](http://coliru.stacked-crooked.com/a/fe761f26dff8834e). You are, however, using `std::cout` and `std::endl`, which forces a flush of the output buffer onto the screen. Try putting the `spintime` values into a `std::vector` and printing them after the run to avoid being blocked by IO. — nwp, Sep 04 '17 at 15:39
@nwp Thank you so much, it works now with your code. If you want to post this as an answer I'd be glad to accept it. — Mefaso, Sep 04 '17 at 16:21

score 0 · Answer 1 · answered Sep 04 '17 at 15:29

On Windows I don't think it's possible to ever get such precise timing, because you cannot guarantee your thread is actually running at the time you desire. Even with low CPU usage and your thread set to real-time priority, it can still be interrupted (by hardware interrupts, as I understand it; I never fully investigated, but I've seen even a simple while(true) ++i; loop at real-time priority get interrupted and then moved between CPU cores). While such interrupts and switches are very quick for a real-time thread, they are still significant if you're trying to drive a signal directly without buffering.

Instead you really want to read and write buffers of digital samples (so at 1 kHz each sample is 1 ms). You need to be sure to queue another buffer before the last one is completed, which constrains how small the buffers can be. At 1 kHz and real-time priority, if the code is simple and there is no other CPU contention, a single-sample buffer (1 ms) might even be possible, which adds at worst 1 ms of latency over "immediate", but you would have to test. You then leave it up to the hardware and its drivers to handle the precise timing (e.g. making sure each output sample is "exactly" 1 ms, to the accuracy the vendor claims).

This basically means your code only has to be accurate to 1 ms in the worst case, rather than trying to pursue something far smaller than the OS really supports, such as microsecond accuracy.

As long as you are able to queue a new buffer before the hardware has used up the previous one, it will be able to run at the desired frequency without issue (to use audio as an example again: while the tolerated latencies, and thus the buffers, are often much larger, if you overload the CPU you can still sometimes hear audible glitches where an application didn't queue up new raw audio in time).

With careful timing you might even be able to get down to a fraction of a millisecond by waiting to process and queue your next sample as long as possible (e.g. if you need to reduce latency between input and output), but remember that the closer you cut it the more you risk submitting it too late.
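
To make the trade-off between buffering latency and tolerated jitter concrete, here is a toy model. It is only a sketch and does not use any NI API: `count_underruns`, `jitter_us`, `period_us` and `lead` are names made up for this illustration. It assumes the hardware consumes sample k at time (k + lead) * period and the application submits sample k at k * period plus some scheduling jitter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model (hypothetical, not a vendor API).
// jitter_us[k] : how late the application submits sample k, in microseconds
// period_us    : sample period in microseconds (1000 for 1 kHz)
// lead         : how many periods ahead of the hardware the application queues
// Returns the number of samples that arrive after the hardware needed them.
int count_underruns(const std::vector<std::int64_t>& jitter_us,
                    std::int64_t period_us, std::int64_t lead) {
    assert(period_us > 0 && lead >= 0);
    int underruns = 0;
    for (std::size_t k = 0; k < jitter_us.size(); ++k) {
        const auto index = static_cast<std::int64_t>(k);
        const std::int64_t submitted = index * period_us + jitter_us[k];
        const std::int64_t consumed = (index + lead) * period_us;
        if (submitted > consumed) {
            ++underruns;  // the sample was not queued in time
        }
    }
    return underruns;
}
```

With jitter values of {50, 300, 1200, 9999, 80} microseconds and a 1000 microsecond period, `lead = 0` misses every sample, `lead = 1` misses the two samples that are more than 1 ms late, and `lead = 10` misses none. Each extra period of lead costs one period of latency, which is the constraint described above.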

score 0 · Accepted Answer · answered Sep 04 '17 at 16:33

According to the comments this code waits correctly:

auto start = std::chrono::high_resolution_clock::now();
const auto delay = std::chrono::milliseconds(1);
while (true) {
    doStuff();  // is quick enough
    logDelays();

    auto spinStart = std::chrono::high_resolution_clock::now();
    // Busy-wait until one full delay has passed since the scheduled start.
    while (std::chrono::high_resolution_clock::now() < start + delay) {}
    const auto spintime = std::chrono::duration_cast<std::chrono::microseconds>
                          (std::chrono::high_resolution_clock::now() - spinStart).count();

    std::cout << "Spin Time micros :" << spintime << std::endl;
    start += delay;  // fixed step from the previous target, so errors do not accumulate
}

The important part is the combination of the busy-wait while (std::chrono::high_resolution_clock::now() < start + delay) {} and start += delay;. Together they make sure that each iteration takes delay on average, even when outside factors (such as Windows Update keeping the system busy) disturb it. If an iteration takes longer than delay, the following iterations run without waiting until the loop catches up (which may be never if doStuff is sufficiently slow).
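
Following the suggestion in the comments to keep IO out of the timed loop, a variant that records the spin times and leaves printing to the caller might look like the sketch below. It is an illustration only: `run_periodic` and `Clock` are names made up here, `std::chrono::steady_clock` is swapped in for `high_resolution_clock` because it is guaranteed to be monotonic, and the work is passed in as a callable standing in for doStuff().

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

using Clock = std::chrono::steady_clock;  // monotonic; an assumption of this sketch

// Calls `work` once per `period` for `ticks` iterations and returns how long
// each iteration spent busy-waiting, in microseconds. Nothing is printed
// inside the loop, so console IO cannot stall the timing.
template <typename Work>
std::vector<std::int64_t> run_periodic(int ticks, Clock::duration period, Work work) {
    assert(ticks >= 0 && period > Clock::duration::zero());
    std::vector<std::int64_t> spin_us;
    spin_us.reserve(static_cast<std::size_t>(ticks));  // avoid allocating in the loop

    auto deadline = Clock::now() + period;
    for (int i = 0; i < ticks; ++i) {
        work();
        const auto spin_start = Clock::now();
        while (Clock::now() < deadline) {
            // busy-wait until the scheduled deadline
        }
        spin_us.push_back(std::chrono::duration_cast<std::chrono::microseconds>(
                              Clock::now() - spin_start).count());
        deadline += period;  // absolute schedule, so errors do not accumulate
    }
    return spin_us;
}
```

Calling `run_periodic(1000, std::chrono::milliseconds(1), doStuff)` would run for about one second, after which the returned values can be printed in one go.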

Note that missing an update (due to the system being busy) and then sending 2 at once to catch up might not be the best way to handle the situation. You may want to check the current time inside doStuff and abort/restart the transmission if the timing is wrong by more than some acceptable amount.
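
One possible way to avoid the catch-up burst is to drop the missed ticks and resume on the original 1 ms grid. The helper below is a sketch under that assumption; `next_tick`, `Schedule` and the `tolerance` parameter are hypothetical names, and whether skipping is acceptable depends on what the DAQ card expects.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

struct Schedule {
    Clock::time_point deadline;  // when the next tick is due
    long skipped;                // how many ticks were dropped to get there
};

// `deadline` is the tick that has just been serviced and `now` is the current
// time. If we are late by at most `tolerance`, the next tick is simply one
// period later. Otherwise the ticks that are already in the past are skipped,
// so the loop resumes on the original grid instead of sending a burst.
Schedule next_tick(Clock::time_point deadline, Clock::time_point now,
                   Clock::duration period, Clock::duration tolerance) {
    assert(period > Clock::duration::zero());
    const auto late = now - deadline;
    if (late <= tolerance) {
        return {deadline + period, 0};
    }
    const auto missed = late / period;  // whole periods we are behind
    return {deadline + (missed + 1) * period, static_cast<long>(missed)};
}
```

In the loop from the answer, `start += delay;` would then become something like `start = next_tick(start, Clock::now(), delay, tolerance).deadline;`, with the `skipped` count logged so that dropped updates stay visible.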
