The program uses the Faster library to store key-value pairs, spilling to disk.
After writing all the entries, the code waits (with a timeout) for Faster to complete any pending operations. It invokes CompletePending and busy-waits with Thread.Yield() until CompletePending() returns true:
    bool done = false;
    while (!done)
    {
        // Ask Faster to complete all pending operations without blocking
        done = this.fasterSession.CompletePending(wait: false);
        if (timeElapsed > timeout)
        {
            // .. throw timeout exception
        }
        Thread.Yield();
    }
This previously ran fine on .NET Framework 4.7.1, usually finishing within seconds (at most within a minute). However, on .NET 6 it started to take longer than 5 minutes to complete.
Removing the timeout and yield altogether resolved the issue for this loop. However, later in the code there is:
this.faster.log.DisposeFromMemory(); // internally busy waits with Thread.Yield()
This call also used to take a few seconds at most, but now takes around 10-20 minutes to complete. The only thing the two have in common is that both invoke Thread.Yield().
IO metrics on the machines look good. The issue is reproducible consistently.
The strangest thing the memory dumps revealed is that the operation (either CompletePending or DisposeFromMemory) finishes instantly after the memory dump is taken. From what I understand, taking a memory dump pauses all the threads and then resumes them.
Nothing interesting shows up in the memory dumps themselves. !clrstack shows only Thread.Yield() or other parent threads waiting. There is an ObjectDisposedException on one of the IOCP threads. The stack trace is not in our code and seems to involve an SslStream.
0:058> !PrintException /d 000001e8488d1d10
Exception object: 000001e8488d1d10
Exception type: System.ObjectDisposedException
Message: Cannot access a disposed object.
InnerException: <none>
StackTrace (generated):
SP IP Function
000000A6B3A7EED0 00007FF8AE0EEB25 System_Net_Security!System.Net.Security.SslStream.<ThrowIfExceptional>g__ThrowExceptional|138_0(System.Runtime.ExceptionServices.ExceptionDispatchInfo)+0x55
000000A6B3A7EF00 00007FF625647910 System_Net_Security!System.Net.Security.SslStream.DecryptData(Int32)+0x130
000000A6B3A7EFE0 00007FF62560EFCC System_Net_Security!System.Net.Security.SslStream+<ReadAsyncInternal>d__188`1[[System.Net.Security.AsyncReadWriteAdapter, System.Net.Security]].MoveNext()+0x66c
!dlk shows nothing of note. !runaway shows the thread running the yield loop taking 5 minutes, which is more or less expected.
At this stage I have tried several different configurations, and Thread.Yield() seems to be the culprit. Changing it to Thread.Sleep(100) causes the operations to complete within a minute. Thread.Sleep(0) or using SpinWait does not work. Upgrading the Faster library to the latest version did not help either. I'm struggling to understand why Thread.Yield() is causing the delay, and why taking a memory dump solves it.
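For reference, the Thread.Sleep(100) variant of the wait loop can be sketched roughly as follows. This is a minimal sketch and not the exact production code: PendingWaiter, WaitUntilComplete, and the tryComplete delegate are hypothetical names introduced here, with tryComplete standing in for the call to fasterSession.CompletePending(wait: false).

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class PendingWaiter
{
    // tryComplete is a hypothetical stand-in for
    // () => fasterSession.CompletePending(wait: false).
    public static void WaitUntilComplete(
        Func<bool> tryComplete, TimeSpan timeout, TimeSpan pollInterval)
    {
        var stopwatch = Stopwatch.StartNew();

        // Poll until the delegate reports that nothing is pending.
        while (!tryComplete())
        {
            if (stopwatch.Elapsed > timeout)
            {
                throw new TimeoutException(
                    $"Pending operations did not complete within {timeout}.");
            }

            // Sleep between polls instead of calling Thread.Yield().
            // In my tests only a non-zero sleep helped; why is the open question.
            Thread.Sleep(pollInterval);
        }
    }
}
```

It would be called along the lines of PendingWaiter.WaitUntilComplete(() => this.fasterSession.CompletePending(wait: false), timeout, TimeSpan.FromMilliseconds(100)). With this shape the wait finishes within a minute, whereas the same loop with Thread.Yield(), Thread.Sleep(0), or SpinWait does not.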