Consider the following program:
#include <cuda/api_wrappers.hpp>

namespace kernels {

template <typename T>
__global__ void print_stuff()
{
    printf("This is a plain printf() call.\n");
}

} // namespace kernels

int main()
{
    auto launch_config { cuda::make_launch_config(2,2) };
    cuda::launch(::kernels::print_stuff<int>, launch_config);
    cuda::outstanding_error::ensure_none();
}
(It uses the cuda-api-wrappers library.)
The program compiles and runs. However, if I run it in a terminal, it prints nothing, while if I run it via nvvp, the console shows:
This is a plain printf() call.
This is a plain printf() call.
This is a plain printf() call.
This is a plain printf() call.
... as expected (2 blocks x 2 threads = 4 lines).
What is, or could be, the reason I am not getting the four lines printed on the terminal as well?
Notes:
- I realize the fault may theoretically be with the library, of which I am the author. So "it has to be the library" is a legitimate answer, but you need to explain why it can't be anything else.
- No warnings when compiling with nvcc -Xcompiler -Wall -Xcompiler -Wextra.
- I use Devuan GNU/Linux 3 (beowulf; equivalent of Debian Buster).
- My hardware: An AMD64 Intel CPU; a GTX 1050 Ti card.
- NVIDIA driver version: 430.50; CUDA version: 10.1.105.
- cuda-memcheck does not complain about the program.
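
Regarding the first note: one way to separate the library from everything else would be a build that bypasses it entirely. Below is a minimal sketch of such a program. It assumes that cuda::launch boils down to a plain <<<...>>> launch on the default stream and that cuda::outstanding_error::ensure_none() amounts to a cudaGetLastError() check. Both mappings are assumptions made for the sake of comparison, not statements taken from the library's documentation. Like the original, it performs no explicit synchronization.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

namespace kernels {

template <typename T>
__global__ void print_stuff()
{
    printf("This is a plain printf() call.\n");
}

} // namespace kernels

int main()
{
    // 2 blocks of 2 threads each, matching make_launch_config(2,2)
    kernels::print_stuff<int><<<2, 2>>>();

    // Assumed stand-in for cuda::outstanding_error::ensure_none():
    // report any error recorded by the runtime so far.
    cudaError_t status = cudaGetLastError();
    if (status != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(status));
        return EXIT_FAILURE;
    }

    // No explicit synchronization here, since the original program
    // does not have any either.
    return EXIT_SUCCESS;
}
```

If this version shows the same difference between the terminal and nvvp, the library is unlikely to be the cause.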