
I am currently developing a Windows kernel driver that implements its own networking stack. While testing basic functionality of the stack, I noticed that replies to pings would sometimes take noticeably longer than usual. Investigating further, I found that KeAcquireSpinLock sporadically takes up to 20 ms to execute (instead of a few µs), even when the lock is not held by another core (I confirmed this by printing the lock value before calling KeAcquireSpinLock).

Since I had no clue why KeAcquireSpinLock takes so long, I implemented a different approach with KeAcquireSpinLockAtDpcLevel, manually raising the IRQL if required:

oldIrql = KeGetCurrentIrql();
if (oldIrql < DISPATCH_LEVEL)
{
  KeRaiseIrql(DISPATCH_LEVEL, &oldIrql);
}

KeAcquireSpinLockAtDpcLevel(m_lock);

// Do something with the shared resources

KeReleaseSpinLockFromDpcLevel(m_lock);
if (oldIrql < DISPATCH_LEVEL) KeLowerIrql(oldIrql);

I expected the above code to be functionally equivalent to KeAcquireSpinLock. However, it turned out that the runtime issue I had with KeAcquireSpinLock is gone and performance is fine with this approach.

I have searched the internet for similar problems with KeAcquireSpinLock, but it seems like I am alone with this issue. Maybe I have a bug in other sections of the driver? Can someone explain this behavior?

Note that I am not talking about deadlocks: KeAcquireSpinLock always returns at some point, and the implementation with KeAcquireSpinLockAtDpcLevel uses the same architecture and locking object.

ApiTiger
  • Can you put together a minimal reproducible example that demonstrates the problem? – stark Oct 26 '21 at 17:45
  • @HansPassant I understand that IRQL affects performance. However, according to the Microsoft documentation, `KeAcquireSpinLock` raises the IRQL to DISPATCH_LEVEL as well, just like I did manually in my code above. This is why I would expect at least equivalent behavior from both implementations. – ApiTiger Oct 26 '21 at 18:54
  • On *x64*, `KeAcquireSpinLock` is not a function at all. It is a macro that expands to `*(OldIrql) = KeAcquireSpinLockRaiseToDpc(SpinLock)`. There is no need to call `KeGetCurrentIrql()`, because `KeAcquireSpinLockRaiseToDpc` itself returns the previous IRQL. – RbMm Oct 26 '21 at 23:17
  • @RbMm Yes, `KeAcquireSpinLock` is doing the same thing as my code above: raising the IRQL and acquiring the spin lock. But why does `KeAcquireSpinLock` sometimes take 20 ms to complete while the above code always takes a few nanoseconds? – ApiTiger Oct 27 '21 at 10:24
  • Are you on an x86 system? On x64, `KeAcquireSpinLock` does not exist as a function at all. – RbMm Oct 27 '21 at 10:33
  • @RbMm I am on x64, so I am using the macro which expands to `KeAcquireSpinLockRaiseToDpc` – ApiTiger Oct 27 '21 at 11:21
  • Are you using KeQueryPerformanceCounter to measure time? Is this running in a VM? – Jeffrey Tippet Oct 27 '21 at 22:17
  • @JeffreyTippet Since 20 ms is quite a lot of time, the timestamps of DbgPrintEx were sufficient to identify the problem, but I have also used KeQueryPerformanceCounter to calculate the runtime. I am not running in a VM; I am using a separate test computer via Remote Desktop to run the tests. – ApiTiger Oct 28 '21 at 07:51
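
For readers who want to reproduce the measurement discussed in the comments above, the following is a minimal sketch of converting two performance counter readings into microseconds. The helper name `ticks_to_us` is hypothetical and not part of the WDK; it is plain C so it can be checked outside the kernel. The kernel-side usage in the comment assumes that `m_lock` and `oldIrql` are declared as in the question.

```c
#include <stdint.h>

/* Hypothetical helper (not a WDK API): convert the difference between two
 * performance counter readings into whole microseconds.
 * The delta is split into whole seconds and a remainder so that the
 * multiplication by 1,000,000 does not overflow for large deltas. */
int64_t ticks_to_us(int64_t start, int64_t end, int64_t freq_hz)
{
    int64_t delta = end - start;
    int64_t whole = (delta / freq_hz) * 1000000;
    int64_t frac  = ((delta % freq_hz) * 1000000) / freq_hz;
    return whole + frac;
}

/* Assumed kernel-side usage (kernel only, not compiled here):
 *
 *   LARGE_INTEGER freq;
 *   LARGE_INTEGER t0 = KeQueryPerformanceCounter(&freq);
 *   KeAcquireSpinLock(m_lock, &oldIrql);
 *   LARGE_INTEGER t1 = KeQueryPerformanceCounter(NULL);
 *   // ... critical section, then KeReleaseSpinLock(m_lock, oldIrql) ...
 *   int64_t us = ticks_to_us(t0.QuadPart, t1.QuadPart, freq.QuadPart);
 */
```

Taking both readings immediately around the acquire call, and logging only after the lock has been released, keeps the cost of `DbgPrintEx` out of the measured interval.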
