
I'm porting C code to HLSL (compute shader). The compiler seems to mishandle one of the for loops: at run time, the display driver reports that the device took an unreasonable amount of time to execute the code.

Here is the partial source code with the offending for loop:

P = FloatToAsciiNoExponent(F * Factor, DstBuf, 7);
uint TmpBuf[FTOA_BUFFER_SIZE];
uint BytesWritten = IntToAscii(Exp10, TmpBuf, BASE10);
DstBuf[P++] = 'E';
[fastopt]
for (uint I = 0; I < BytesWritten; I++)
    DstBuf[P++] = TmpBuf[I];

At run time, I got the following debug message:

D3D11 ERROR: ID3D11Device::RemoveDevice: Device removal has been triggered for the following reason (DXGI_ERROR_DEVICE_HUNG: The Device took an unreasonable amount of time to execute its commands, or the hardware crashed/hung. As a result, the TDR (Timeout Detection and Recovery) mechanism has been triggered. The current Device Context was executing commands when the hang occurred. The application may want to respawn and fallback to less aggressive use of the display hardware). EXECUTION ERROR #378: DEVICE_REMOVAL_PROCESS_AT_FAULT]

If I comment out the two for-loop lines, everything is OK (except, of course, that the final result lacks its last part).

FloatToAsciiNoExponent() is a function that converts its first argument into a list of ASCII codes stored in the second argument (an array of uint). The last argument is the numeration base for the conversion. It has been validated.

IntToAscii() is a function that converts its first argument into a list of ASCII codes stored in the second argument (an array of uint). It has been validated.
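For comparison, the expected result of the fragment can be sketched on the CPU. This is only a minimal sketch under stated assumptions: the IntToAscii() body below is a hypothetical stand-in (most significant digit first, returning the number of codes written) and not the function from the linked source, AppendExponent() is a name introduced here for illustration, and the temporary buffer size of 16 is assumed because the real value of FTOA_BUFFER_SIZE is not shown.

```cpp
#include <cstdint>

// Hypothetical stand-in for the shader's IntToAscii(): stores one ASCII code
// per uint32_t element, most significant digit first, and returns the number
// of elements written. Buf must be large enough for the chosen base.
static uint32_t IntToAscii(int32_t Value, uint32_t* Buf, uint32_t Base)
{
    uint32_t Count = 0;
    // Work on the unsigned magnitude so that INT32_MIN does not overflow.
    uint32_t Mag = (Value < 0) ? 0u - static_cast<uint32_t>(Value)
                               : static_cast<uint32_t>(Value);
    if (Value < 0)
        Buf[Count++] = '-';
    const uint32_t First = Count;
    // Emit digits least significant first...
    do {
        const uint32_t Digit = Mag % Base;
        Buf[Count++] = (Digit < 10) ? ('0' + Digit) : ('A' + Digit - 10);
        Mag /= Base;
    } while (Mag != 0);
    // ...then reverse them in place.
    for (uint32_t L = First, R = Count - 1; L < R; ++L, --R) {
        const uint32_t T = Buf[L];
        Buf[L] = Buf[R];
        Buf[R] = T;
    }
    return Count;
}

// Mirrors the shader fragment: appends 'E' and the exponent digits to DstBuf
// starting at index P, and returns the new write position.
static uint32_t AppendExponent(uint32_t* DstBuf, uint32_t P, int32_t Exp10)
{
    uint32_t TmpBuf[16]; // assumed size; the real FTOA_BUFFER_SIZE is not shown
    const uint32_t BytesWritten = IntToAscii(Exp10, TmpBuf, 10);
    DstBuf[P++] = 'E';
    for (uint32_t I = 0; I < BytesWritten; I++)
        DstBuf[P++] = TmpBuf[I];
    return P;
}
```

Running the same Exp10 values through this and comparing with the buffer read back from the GPU would show whether the shader's copy loop produces the same codes.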

The original C source code I'm porting can be found here: https://searchcode.com/codesearch/view/14753060/

I'm running Windows 7 with the June 2010 DirectX SDK (the last one that runs on Windows 7). Windows Update has been run and every update is installed. The graphics card is an NVIDIA Quadro K4200 with 24GB of RAM, driver version 431.02.

Any help appreciated.

Doug Richardson
fpiette

2 Answers


With DirectCompute you still need to make sure each instance completes in a reasonable time or you will hit TDR timeouts (~2 seconds).

See Microsoft Docs

With DirectX 11.1 (Windows 8 or later), you can use D3D11_CREATE_DEVICE_DISABLE_GPU_TIMEOUT to give you a bit more time for a long-running DirectCompute shader, but you can make the system unresponsive.

You can install a partial version of DirectX 11.1 on Windows 7 Service Pack 1 via KB2670838.

You should also read this blog post for more up-to-date information about the legacy DirectX SDK.

UPDATE: Apparently this was actually an HLSL compiler bug in the legacy DirectX SDK. Note that you can and should use the latest Windows 10 SDK HLSL compiler even for DirectX 11 on Windows 7. See this blog post and Microsoft Docs.

Chuck Walbourn
  • Thanks. Yes, I know. That is not the issue. The code in the loop is executed at most 16 times and is just a uint copy from one array to another, nothing that takes significant time. My guess is that the compiler somehow generates wrong code. KB2670838 is already installed on my system, which, as I said, is a fully up-to-date Windows 7. – fpiette Aug 28 '19 at 05:51
  • I upgraded my PC to Windows 10, which includes DirectX 12. Now the source code works as expected. This confirms that the June 2010 compiler is buggy. – fpiette Aug 30 '19 at 12:42

Answering myself:

I upgraded my PC to Windows 10, which includes DirectX 12. Now the source code works as expected. This confirms that the June 2010 compiler is buggy.

fpiette