I have a problem with the nvcc compiler. Host code compiled with nvcc 4.2 runs about 5 times slower than the same code compiled with g++ 4.4.6. I am using the NVIDIA SDK Makefile template to compile the code in the release configuration, and -O2 optimization is used in both cases. How can this be, given that nvcc should pass the host code to the host compiler? Any ideas?
This is my makefile:
# Add source files here
EXECUTABLE := App
verbose=1
# C/C++ source files (compiled with gcc / c++)
CCFILES := \
cmdl.cpp main.cpp
# Cuda source files (compiled with cudacc)
CUFILES_sm_30 := AppCuda.cu AppHost.cpp
# Do not link with CUTIL
OMIT_CUTIL_LIB := 1
################################################################################
# Rules and targets
ROOTDIR=/home/snpsyn/NVIDIA_GPU_Computing_SDK/C/common
include $(ROOTDIR)/../common/common.mk
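
One way to narrow this down is to time a small stand-alone host file built once with each compiler, outside the SDK makefile. The following is only a minimal sketch: the file name bench.cpp, the functions workload and time_workload, and the workload itself are hypothetical stand-ins, not part of App. It sticks to C++98 so that g++ 4.4.6 accepts it without extra flags.

```cpp
// bench.cpp: hypothetical stand-alone timing helper, not part of the original App.
#include <cstddef>
#include <cstdio>
#include <ctime>
#include <vector>

// Hypothetical CPU-bound workload standing in for the real host code:
// fills a vector with 0.5 * i and returns the sum of squares.
double workload(std::size_t n) {
    std::vector<double> v(n);
    for (std::size_t i = 0; i < n; ++i) {
        v[i] = 0.5 * static_cast<double>(i);
    }
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += v[i] * v[i];
    }
    return sum;
}

// Runs the workload once, prints the result and the CPU time used,
// and returns that time in seconds (measured with std::clock).
double time_workload(std::size_t n) {
    std::clock_t start = std::clock();
    double result = workload(n);
    std::clock_t stop = std::clock();
    double seconds = static_cast<double>(stop - start) / CLOCKS_PER_SEC;
    std::printf("result=%g cpu_time=%.3f s\n", result, seconds);
    return seconds;
}
```

Calling time_workload(10000000) from a main function and building the file with "g++ -O2 bench.cpp -o bench_gcc" and with "nvcc -O2 bench.cpp -o bench_nvcc" would show whether the slowdown reproduces outside the SDK makefile. If it does not, the flags that common.mk actually passes to each compiler are worth comparing; verbose=1 is already set above, and nvcc --dryrun lists the commands nvcc would run without executing them.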