
On Windows 10, the same code returns different numerical results depending on whether it is compiled with the CUDA 9.2 backend (nvcc with cl.exe) or with the OpenMP backend (g++ provided by MinGW). The CUDA results are correct, while the OpenMP results contain some broken samples.

Right now I can't tell what is happening, and I haven't found any similar report on the web, but I am sure it's because I am doing something stupid.

I am working on a decent isolated example right now, but in the meantime: are there any rookie mistakes that can cause this kind of discrepancy?

1 Answer


In the end, the problem was that my functor had internal variables that caused a data race under OpenMP, but not under CUDA. Shame on me.
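For anyone hitting the same symptom, here is a minimal sketch of what this kind of bug can look like. It is not my original code: `RacyScale`, `SafeScale`, and `apply` are hypothetical names made up for illustration, and the hand-written OpenMP loop only stands in for whatever the backend does internally. The assumption is that, on the OpenMP side, all threads end up calling the same functor object, so a member used as scratch space is shared, whereas on the CUDA side each thread works on its own copy of the functor.

```cpp
#include <cstddef>
#include <vector>

// Racy pattern (illustration only): the intermediate value is stored in a
// member. If several threads call the same functor object, they all write
// to the same `tmp`, and one thread can read a value written by another.
struct RacyScale {
    mutable float tmp = 0.0f;
    float operator()(float x) const {
        tmp = x * 2.0f;      // another thread may overwrite tmp here
        return tmp + 1.0f;   // so this can return a wrong sample
    }
};

// Safe pattern: the intermediate value is a local variable, so every call
// gets its own copy on the calling thread's stack.
struct SafeScale {
    float operator()(float x) const {
        const float tmp = x * 2.0f;
        return tmp + 1.0f;
    }
};

// Hypothetical stand-in for a parallel transform. `f` is shared by all
// threads inside the loop, which is what makes RacyScale unsafe here.
// Built without OpenMP support, the pragma is ignored and the loop is serial.
template <typename F>
std::vector<float> apply(const std::vector<float>& in, F f) {
    std::vector<float> out(in.size());
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(in.size());
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        out[i] = f(in[i]);
    }
    return out;
}
```

Calling `apply(data, SafeScale{})` gives the same result regardless of the number of threads. With `RacyScale` the result is only reliable when the loop runs serially, which would match the "some broken samples" I was seeing.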