If I write the following CUDA code:
#include <stdio.h>

template <unsigned N>
__global__ void foo()
{
    printf("In kernel foo() with N = %u\n", N);
    if (N < 10) { return; }
    printf("Wow, N is really high!\n");
    /* a whole lot of code here which I don't want to indent */
}

int main() {
    foo<5><<<1,1>>>();
    foo<20><<<1,1>>>();
    return 0;
}
I get a compiler warning:
a.cu(8): warning: statement is unreachable
detected during instantiation of "void foo<N>() [with N=5U]"
(12): here
I "feel" I shouldn't be getting this warning, since the unreachable code is only unreachable for certain values of the template parameter. And if I write the "CPU equivalent", so to speak:
#include <cstdio>

template <unsigned N>
void foo()
{
    std::printf("In kernel foo() with N = %u\n", N);
    if (N < 10) { return; }
    std::printf("Wow, N is really high!\n");
    /* a whole lot of code here which I don't want to indent */
}

int main() {
    foo<5>();
    foo<20>();
    return 0;
}
and build this with gcc (5.4.0), I don't get any warnings, even if I compile with -Wall.
Now, I can circumvent this by writing
if (not (N < 10)) {
    printf("Wow, N is really high!\n");
    /* a whole lot of code here which I don't want to indent */
}
but I would rather avoid having to reverse my logic to jump through nvcc's "hoop". I could also write
if (N < 10) {
    return;
}
else {
    printf("Wow, N is really high!\n");
    /* a whole lot of code here which I don't want to indent */
}
but I don't want to indent all that code (and the same problem may occur again, requiring even more indentation inside the else block).
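A third workaround, sketched here in plain C++ so it can be compiled without nvcc, is to move the guarded tail into a helper that is selected at compile time, so that no instantiation contains a statement after an unconditional return. The names `foo_tail` and `IsLarge` are hypothetical, and the `int` return value exists only to make the sketch easy to check. I assume, without having verified it across nvcc versions, that the same pattern carries over to a `__global__` function calling a `__device__` helper.

```cpp
#include <cstdio>

// Hypothetical helper, not part of the original code. The primary template
// holds the "large N" tail at its natural indentation level.
template <unsigned N, bool IsLarge = (N >= 10)>
struct foo_tail {
    static int run() {
        std::printf("Wow, N is really high!\n");
        /* a whole lot of code here, with no extra indentation */
        return 1;  // illustrative only: signals that the tail ran
    }
};

// Specialization chosen when N < 10: there is no tail to run, and therefore
// no statement the compiler could flag as unreachable.
template <unsigned N>
struct foo_tail<N, false> {
    static int run() { return 0; }
};

template <unsigned N>
int foo() {
    std::printf("In foo() with N = %u\n", N);
    return foo_tail<N>::run();  // resolved at compile time
}
```

This keeps the tail at a single indentation level, but it splits the function in two, which is still a hoop of sorts.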
Is there something I can do about this? Also, isn't this a bug, or at least a misfeature, that I should report?