
I'm looking for a fresh response to the outdated answers (and contradictions) I found here and elsewhere. I need to use REALTIME_PRIORITY_CLASS and THREAD_PRIORITY_TIME_CRITICAL on a few threads. The responses here and elsewhere are similar to the following:

"Caution: if you are asking for true realtime priority, you are going to get it. This is a nuke. The OS will mercilessly prioritize a realtime priority thread, well above even OS-level input processing, disk-cache flushing, and other high-priority time-critical tasks."

AND...

"This may also cause high temperatures on the CPU and even burn out the CPU or motherboard if the cooling system is not designed for prolonged maximum CPU usage."

ASSUMPTION: Yes, I can see that happening on a dual-core processor. I have a Ryzen 7 (8 cores, 16 threads), and the software being developed is designed to test samples on nothing less. Worst-case scenario, I would think the OS would function as if it were running on a quad core (I'm utilizing 4 threads). As for the burnout, AMD has an unlock feature on some of these processors to push overclocking safely, so I rather doubt burnout. BUT I COULD BE WRONG; please feel free to correct me.

WHY: The software processes groups of samples. Each group must be fully processed so its results can be passed on to the next group. The threads are released upon a group, and then they can relax while the next group is set up.

The software now runs strictly on atomic variables used as switches. I had one last bottleneck that required 2 spinlocks/mutexes (because Windows can context-switch a thread out while it is inside the critical section, or worse). REALTIME_PRIORITY_CLASS would have resolved it, but an array of atomic_ints used as latches/switches resolved the problem of handling straggling threads a little more safely. Examples of atomic switches:

if (Worker.fetch_add(1) == 0) //FIRST THREAD IN
   DoFirstThreadJob(0);

if (Worker.fetch_sub(1) == 1) // LAST THREAD OUT
   DoCleanUpJob(1);


    /// QUICK TRY LOCK
if (!OnlyOneAtATime.fetch_add(1))   // only a thread that saw 0 runs the setup
{
   SetupStuff();
}
OnlyOneAtATime.fetch_sub(1);


// MULTI THREAD INDEXER SO EACH THREAD GETS ITS OWN TASK
LocalIndex = AtomicIndexer.fetch_add(1);
while (LocalIndex < GlobalMaxBatchCount)
{
    DoBatchAssignment(LocalIndex);
    LocalIndex = AtomicIndexer.fetch_add(1);    
}

SO THE QUESTIONS REMAIN:

  1. Our realtime experiment kept the threads tight (no straggler threads running 1 to 3 batches behind the current one). If I keep them tight, I can take out a wait loop and let them start on the next series of tests (free flow). The computer ran smoothly on an all-out, FULL-BLOWN NUKE of 4 realtime threads. But a 10-15 minute test is not 3 or 4 hours. Will someone wake me in the morning and tell me a blob of plastic is sitting where the computer was? (I'm trying to stay amused.)

  2. The following example shows one test of switching priorities in and out. If I have to play it safe, how much time am I losing switching back and forth? (As mentioned, we had this cranked all the way up, using 4 threads (half the core count) with NO switching out, and the computer ran smoothly.)

The threads go through Manager(), where the process priority class is bumped up to REALTIME_PRIORITY_CLASS; when it returns, the class is bumped back down. Right before a thread processes a batch, its thread priority gets bumped up to THREAD_PRIORITY_TIME_CRITICAL, then back to THREAD_PRIORITY_NORMAL upon its return.

    void DoBatchWork(BatchResultType& ResultList)  // by reference, so results reach the caller
    {
        ///////////////////////////////////////////////////// START CRITICAL ZONE
        if (!SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL))
            ErrorLog(103, GetLastError());
        //////////////////////////////////////////////////////START CRITICAL ZONE

        GetResultsOfBatchTest(ResultList);

        ///////////////////////////////////////////////////// END CRITICAL ZONE
        if (!SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_NORMAL))
            ErrorLog(104, GetLastError());
        //////////////////////////////////////////////////////END CRITICAL ZONE
    }

    void Manager()
    {
        if (!SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS))
            Log(101, GetLastError());
        /////////////////////////////////////////////////////////////////////

        WorkOnAllBatches();

        /////////////////////////////////////////////////////////////////////
        if (!SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS))
            Log(102, GetLastError());
    }

WHAT HAVE I DONE: You can see there are a good 1000+ priority combinations to try. If someone here has pushed the limits, please chime in. Otherwise I could be testing for several days, when the highest setting appears to be okay.

SIDE NOTE: I've seen responses on other questions recommending dedicating cores (affinity). I've also seen contradicting answers on how to set affinity, saying you don't want to do that because Windows manages things such as overheating. In truth, the CPU and the OS can migrate threads between cores and manage thread-to-core placement better than I can, and I'm okay with that. I'm focused on lengthy time-sliced threads, so I'll let Windows and the processor do what they do best.

Adrian E
    You're right, the concerns about burning out your CPU are nonsense, especially on CPUs from the last 15 years. Real-time scheduling priority doesn't force max turbo; the hardware will still throttle the clock speed if it gets too hot. Exactly the same as if something like Prime95 is running at normal priority and no other tasks are waiting for CPU time: the CPU will spend essentially all its time running 2/clock FMA instructions, the highest-power workload possible. Even if there was a difference like 99.9% vs. 99.99% of time spent in high-heat code, it wouldn't make a difference. – Peter Cordes Aug 11 '23 at 21:51
  • 1
    As for breaking the system by not leaving CPU time for kernel tasks, yes hopefully running fewer threads than you have logical cores will avoid that. Some things probably need to run on each core, but hopefully only things that run inside actual interrupt handlers or at even higher than realtime priority. I don't know Windows kernel details. I assume Linux schedules interrupt-handler bottom-halves with a priority above realtime so they don't get starved. I also wonder about RCU `run_on`; hopefully that doesn't starve either, like it does when you disable interrupts on a core for a long time. – Peter Cordes Aug 11 '23 at 21:58
  • 1
    Those first two questions seem different and more general than the stuff with spin loops you're asking about later. If there's another specific question in there, that should maybe be a separate question. – Peter Cordes Aug 11 '23 at 21:59
  • @PeterCordes - Thanks for the quick response. This computer felt warm when not threaded, but all the cores sat at practically idle, around 30% usage in the task manager. THANK YOU for the confirmation on that end. That's the only question. I was only sharing ideas on using atomics where the spinlock/mutex was mentioned; that issue is resolved. This all runs smoothly without tying up threads with mutexes/spinlocks. – Adrian E Aug 11 '23 at 22:03
  • @PeterCordes - I checked the Resource manager. Now 4 Cores are running full. The 4 other cores are now in use, not heavy, so windows must have decided to use those. I'm thinking I don't have to worry about OS/Hardware task. – Adrian E Aug 11 '23 at 22:23
  • The things to worry about are "priority inversions", "starvation" and "unfair scheduling". – Jesper Juhl Aug 12 '23 at 00:16
  • @JesperJuhl - Would you please give more details. Starvation doesn't appear to be a problem. 5-6 cores at any period of time sat practically on idle. When I ran this with 4 Threads, they finally moved, looking normal, while 4 cores jumped up. "Priority Inversion" and "Unfair Scheduling", please explain. I choose NOT to dedicate Cores and let Windows OS, handle core swapping, and let it handle underlying issues (e.g. lower priority taking precedent over higher priority thread of its own). It has 4 cores, 8 hardware threads to work with. – Adrian E Aug 12 '23 at 00:39

1 Answer


You're right, the concerns about burning out your CPU are nonsense, especially on CPUs from the last 15 years. Real-time scheduling priority doesn't force max turbo; the hardware will still throttle the clock speed if it gets too hot. Exactly the same as if something like Prime95 is running at normal priority and no other tasks are waiting for CPU time: the CPU will spend essentially all its time running 2/clock FMA instructions, the highest-power workload possible. Even if there was a difference like 99.9% vs. 99.99% of time spent in high-heat code, it wouldn't make a difference.

As for breaking the OS by not leaving CPU time for kernel tasks, yes, hopefully running fewer threads than you have logical cores will avoid that. Some things probably need to run on each core, but hopefully only things that run inside actual interrupt handlers or at even higher than realtime priority.

I don't know Windows kernel details. I assume Linux schedules interrupt-handler bottom-halves with a priority above realtime so they don't get starved. I also wonder about Linux RCU run_on; hopefully that doesn't starve either. (That does happen when you disable interrupts on a core for a long time, leading to the whole system basically locking up and not processing keyboard input.)

If there were going to be problems of this nature, I expect you'd notice within minutes, if I/O buffers were filling up and not getting flushed to disk or something. Maybe longer if you had lots of RAM and/or not much I/O was happening. I don't expect any problems like that, as long as you run fewer realtime threads than logical cores so there's somewhere for kernel tasks to be scheduled.

Peter Cordes