I am developing a CUDA application for GTX 580 with Visual Studio 2010 Professional on Windows 7 64bit. My project builds fine with CUDA Toolkit 4.0, but nvcc crashes when I choose CUDA Toolkit 4.1 or 4.2 with the following error:
1> Stack dump:
1> 0. Running pass 'Promote Constant Global' on module 'moduleOutput'.
1>CUDACOMPILE : nvcc error : 'cicc' died with status 0xC0000005 (ACCESS_VIOLATION)
Strangely enough, the program compiles OK with "compute_10,sm_10" specified for "Code Generation", but "compute_20,sm_20" does not work. The code in question can be downloaded here:
http://www.meriken2ch.com/files/CUDA_SHA-1_Tripper_MERIKENs_Branch_0.04_Alpha_1.zip
(README.txt is in Japanese, but comments in source files are in English.)
I am suspecting a newly introduced bug in CUDA Toolkit 4.1/4.2. Has anybody encountered this issue? Is there any workaround for it? Any kind of help will be much appreciated.