My code is written mainly with OpenACC. I would like to compare its results on a P100 with results on Intel KNL nodes using OpenMP. I tried the compiler flag -ta=multi_core, but it essentially serialized all the loops (according to the -acc info output). Is the only option to guard every loop with preprocessor directives, as below? Or is there a more efficient or cleaner way?
#ifndef _OPENACC
#pragma omp .....
#else
#pragma acc ......
#endif
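
To make the pattern concrete, here is a minimal sketch of what I mean for a single loop. The saxpy function, its arguments, and the data clauses are hypothetical and only stand in for my real loops; the guard relies on the `_OPENACC` macro, which OpenACC compilers define when OpenACC is enabled.

```c
/* Hypothetical stand-in for one of the real loops: y = a*x + y. */
void saxpy(int n, float a, const float *x, float *y)
{
#ifndef _OPENACC
    /* Built without OpenACC: use OpenMP threads on the host (e.g. KNL). */
#pragma omp parallel for
#else
    /* Built with OpenACC: offload the loop (e.g. to the P100). */
#pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
#endif
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

If neither OpenMP nor OpenACC is enabled at compile time, the pragmas are ignored and the loop simply runs serially, so the same source still builds. The drawback is that every loop needs its own guard, which is what I am hoping to avoid.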