
I'm trying to disable all interrupts, including NMIs, on a single core of a processor and put that core into an infinite loop with a JMP instruction that targets itself (machine code EB FE). I tried this with the following code:

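cli                 ; mask maskable interrupts on this core
in al, 0x70         ; read the CMOS index port (bit 7 = NMI disable)
mov bl, 0x80
or al, bl           ; set the NMI-disable bit
out 0x70, al
jmp $               ; jump to self (bytes EB FE)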

I assumed that disabling NMIs would also disable the watchdog since, according to this link, the watchdog timer is delivered as an NMI. But when I ran this code, after around 5 seconds my computer bugchecked with code 0x101 (CLOCK_WATCHDOG_TIMEOUT). I'm wondering whether Windows notices that I've disabled NMIs and re-enables them before initiating the kernel panic. Does anyone know how to disable the watchdog timer in Windows 7?

  • I can't come up with a single good reason to do this. – Joshua Mar 29 '19 at 00:22
  • @Joshua: high-precision micro-benchmarking is the obvious use-case for disabling the watchdog timer. But then running an empty infinite loop seems pointless, agreed. – Peter Cordes Mar 29 '19 at 03:43
  • In theory one could boot Windows on fewer CPUs than the system has (system configuration->boot->advanced options), but then one would need to boot the rest of the CPUs manually. – Alexey Frunze Mar 29 '19 at 03:51
  • @AlexeyFrunze I've already done that, I'm trying to learn more about the internals of my hardware and that's the point of why I'm doing this. – KeepForgettingMyUserName Mar 29 '19 at 04:47
  • And what are the results of doing that, if any? – Alexey Frunze Mar 29 '19 at 04:58
  • @AlexeyFrunze I'm not quite sure about what you're asking. If you mean the results of enabling one core, it successfully hangs the system without a kernel panic. – KeepForgettingMyUserName Mar 29 '19 at 09:43
  • Successfully? Is it what you do/expect? Hangs the entire system, not just that one CPU? – Alexey Frunze Mar 29 '19 at 10:05
  • I don't think the NMIs are the problem here, I'm writing an answer on that. To be sure, if you remove the NMI masking code, will the watchdog timer still bugcheck the system? – Margaret Bloom Mar 29 '19 at 10:17
  • @MargaretBloom yes if I remove the NMI code it still bugchecks the system with the same code, I actually initially didn't have the NMI code when I first tried this. – KeepForgettingMyUserName Mar 30 '19 at 04:32
  • @AlexeyFrunze I think there was a misunderstanding, I'm trying to lock up a single core for nothing other than the sake of locking up a single core. The fact that the OS interferes and bugchecks the system is something I'd like to avoid, what if in the future I want to run code with interrupts disabled for an extended period of time? This is why I feel it's necessary for me to learn this kind of stuff. – KeepForgettingMyUserName Mar 30 '19 at 04:33
  • Windows is not quite designed for that. Write your own (mini)OS. Or hack Linux. – Alexey Frunze Mar 30 '19 at 06:41
  • So it's not an HW WDT. Windows is probably requiring each logical CPU to report their ticks. Linux supports CPU hotplugging, Windows *should* too but maybe only when para-virtualised. – Margaret Bloom Mar 30 '19 at 07:19

1 Answer


I don't think the NMIs are at fault here.

External NMIs are obsolete; they are hard to route in an SMP system. That watchdog timer is also obsolete. It was either a secondary PIT or a limited fourth channel of the primary PIT:

----------P00440047--------------------------
PORT 0044-0047 - Microchannel - PROGRAMMABLE INTERVAL TIMER 2
SeeAlso: PORT 0040h,PORT 0048h

0044  RW  PIT  counter 3 (PS/2)
        used as fail-safe timer. generates an NMI on time out.
        for user generated NMI see at 0462.
0047  -W  PIT  control word register counter 3 (PS/2, EISA)
    bit 7-6 = 00  counter 3 select
        = 01  reserved
        = 10  reserved
        = 11  reserved
    bit 5-4 = 00  counter latch command counter 3
        = 01  read/write counter bits 0-7 only
        = 1x  reserved
    bit 3-0 = 00
----------P0048004B--------------------------
PORT 0048-004B - EISA - PROGRAMMABLE INTERVAL TIMER 2
Note:   this second timer is also supported by many Intel chipsets
SeeAlso: PORT 0040h,PORT 0044h

0048  RW  EISA PIT2 counter 3 (Watchdog Timer)
0049  ??  EISA 8254 timer 2, not used (counter 4)
004A  RW  EISA PIT2 counter 5 (CPU speed control)
004B  -W  EISA PIT2 control word

This hardware is gone; it's not present on modern systems. I've tested my machine and I don't have it.
Intel chipsets don't have it:

[Image: no secondary PIT]

There is only the primary PIT.

Modern timers are the LAPIC timer and the HPET (Linux even resorted to using the PMC registers).


Windows does support a HW WDT; in fact, Microsoft went as far as defining an ACPI extension for it: the WDAT table.

This WDT, however, can only reboot or shut down the system, and it does so in hardware, without any intervention from software.

// Configures the watchdog hardware to perform a reboot  
// when it is fired.
//
#define WATCHDOG_ACTION_SET_REBOOT 0x11
//
// Determines if the watchdog hardware is configured to perform 
// a system shutdown when fired.
//
#define WATCHDOG_ACTION_QUERY_SHUTDOWN 0x12
//
// Configures the watchdog hardware to perform a system shutdown 
// when fired. 
//
#define WATCHDOG_ACTION_SET_SHUTDOWN 0x13

Microsoft set quite a few requirements for this WDT, since it must be set up as early as possible in the boot process, before PnP enumeration (i.e. PCI(e) enumeration).

This is not the timer that bugchecked your system. By the way, I don't have this timer (my system is missing the WDAT table) and I don't expect it to be found on client hardware.


The bugcheck 0x101 is due to a software WDT; it is raised inside a function in ntoskrnl.exe.
This function is called by KeUpdateRunTime and by another chain of calls starting in DriverEntry:

[Image: xrefs of wdt]

According to Windows Internals, KeUpdateRunTime is used to update Windows' internal tick count.
I'd expect only a single logical processor to be put in charge of that, though I'm not sure exactly how Windows keeps time.

I'd also expect this software WDT to be implemented in a master-slave fashion: each CPU increments its own counter and a designated CPU checks the counters periodically (or any equivalent implementation).

This seems to be suggested by the wording of the documentation of the 0x101 bugcheck:

The CLOCK_WATCHDOG_TIMEOUT bug check has a value of 0x00000101. This indicates that an expected clock interrupt on a secondary processor, in a multi-processor system, was not received within the allocated interval.
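
The master-slave scheme guessed at above can be sketched in C as follows. This is only an illustration of the idea, not how ntoskrnl.exe actually does it: the structure, the function names clock_tick and watchdog_check, the core count and the tolerance of three missed checks are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CPUS 4          /* hypothetical core count */
#define MAX_MISSED_CHECKS 3 /* hypothetical tolerance before a "bugcheck" */

/* Hypothetical per-CPU bookkeeping; not the real ntoskrnl.exe layout. */
struct cpu_watch {
    uint64_t ticks;     /* incremented by the CPU's own clock interrupt */
    uint64_t last_seen; /* tick value the checker observed last time */
    unsigned missed;    /* consecutive checks without progress */
};

static struct cpu_watch cpus[NUM_CPUS];

/* Would be called by each CPU from its own clock interrupt handler. */
void clock_tick(int cpu)
{
    assert(cpu >= 0 && cpu < NUM_CPUS);
    cpus[cpu].ticks++;
}

/* Would be called periodically by the designated CPU.
 * Returns the index of the first CPU whose counter has not moved for
 * MAX_MISSED_CHECKS consecutive checks, or -1 if all are making progress. */
int watchdog_check(void)
{
    for (int i = 0; i < NUM_CPUS; i++) {
        if (cpus[i].ticks != cpus[i].last_seen) {
            cpus[i].last_seen = cpus[i].ticks;
            cpus[i].missed = 0;
        } else if (++cpus[i].missed >= MAX_MISSED_CHECKS) {
            return i; /* this is where a 0x101-style bugcheck would fire */
        }
    }
    return -1;
}
```

A core spinning in `jmp $` with interrupts disabled never calls clock_tick, so under this model the checker would flag it after a few periods regardless of whether NMIs are masked.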

Again, I'm not an expert on this part of Windows (the user MdRm probably is) and this may be utterly wrong, but if it isn't, you are probably better off following Alex's advice and booting with one fewer logical CPU.
You can then execute code on that CPU with an INIT-SIPI-SIPI sequence as described in Intel's manual, but you must be careful because the issuing processor is using paging while the sleeping one is not yet (it will start up in real mode).
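
As a rough sketch of what the INIT-SIPI-SIPI sequence involves, the helpers below only compute the values that would be written to the local APIC's Interrupt Command Register in xAPIC mode. The helper names are made up for illustration. Actually writing the registers (ICR high at offset 0x310, then ICR low at offset 0x300 from the APIC base), mapping that range from a driver, and the delays between the three IPIs are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Field values from the xAPIC Interrupt Command Register layout. */
#define ICR_DELIVERY_INIT    (5u << 8)  /* delivery mode 101b */
#define ICR_DELIVERY_STARTUP (6u << 8)  /* delivery mode 110b */
#define ICR_LEVEL_ASSERT     (1u << 14)

/* High dword of the ICR: destination APIC ID in bits 24-31 (xAPIC mode). */
uint32_t icr_high(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}

/* Low dword for the INIT IPI. */
uint32_t icr_low_init(void)
{
    return ICR_DELIVERY_INIT | ICR_LEVEL_ASSERT;
}

/* Low dword for a Start-up IPI. The vector field is the 4 KiB page number
 * of the real-mode entry point, so the address must be page aligned and
 * below 1 MiB. */
uint32_t icr_low_sipi(uint32_t start_addr)
{
    assert((start_addr & 0xFFFu) == 0 && start_addr < 0x100000u);
    return ICR_DELIVERY_STARTUP | ICR_LEVEL_ASSERT | (start_addr >> 12);
}
```

For example, a trampoline copied to physical address 0x8000 gives a SIPI vector of 0x08, and the target core begins executing there in real mode. That is why the startup code has to live in low physical memory and cannot rely on the issuing processor's page tables.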

Initialising a CPU may be a little cumbersome, but not overly so.
Stealing one from Windows may cause other problems besides the WDT, for example if Windows has routed an interrupt to that processor only.

I don't know if there is a driver API to unregister a logical processor; I found nothing when looking at the exports of hal.dll and ntoskrnl.exe.

Margaret Bloom