TensorFlow 1.6.0 with CUDA Support on CentOS 6.10 C++ linking against libtensorflow(_cc)(_framework).so

Question

So far I have successfully had the CPU-only version working with my software.

Now I have set up a new machine with CUDA hardware, which I already have working under Windows using the contrib cmake build.

The Linux build uses bazel and builds the required targets:

//tensorflow:libtensorflow.so
//tensorflow:libtensorflow_cc.so
//tensorflow:libtensorflow_framework.so

The source comes from this .zip file: https://github.com/tensorflow/tensorflow/archive/v1.6.0.zip

It was built with bazel against CUDA 9.1 and cuDNN 7.1.2, using gcc 4.8.5 on CentOS 6.10.

But when I go ahead to build my software I get the following error:

  CXX      Linux-64-debug/rotobot.o
  In file included from /home/sam/dev/tensorflow-1.6.0/tensorflow/core/framework/tensor.h:23:0,
                 from /home/sam/dev/tensorflow-1.6.0/tensorflow/core/public/session.h:24,
                 from rotobot.cpp:32:
/home/sam/dev/tensorflow-1.6.0/tensorflow/core/framework/types.h: In instantiation of ‘struct tensorflow::DataTypeToEnum<long int>’:
/home/sam/dev/tensorflow-1.6.0/tensorflow/core/framework/tensor.h:566:46:   required from ‘typename tensorflow::TTypes<T, NDIMS>::Tensor tensorflow::Tensor::tensor() [with T = long int; long unsigned int NDIMS = 3ul; typename tensorflow::TTypes<T, NDIMS>::Tensor = Eigen::TensorMap<Eigen::Tensor<long int, 3, 1, long int>, 16, Eigen::MakePointer>]’
rotobot.cpp:1742:53:   required from here
/home/sam/dev/tensorflow-1.6.0/tensorflow/core/framework/types.h:356:3: error: static assertion failed: Specified Data Type not supported
   static_assert(IsValidDataType<T>::value, "Specified Data Type not supported");
   ^
In file included from /home/sam/dev/tensorflow-1.6.0/tensorflow/core/public/session.h:24:0,
                 from rotobot.cpp:32:
/home/sam/dev/tensorflow-1.6.0/tensorflow/core/framework/tensor.h: In instantiation of ‘typename tensorflow::TTypes<T, NDIMS>::Tensor tensorflow::Tensor::tensor() [with T = long int; long unsigned int NDIMS = 3ul; typename tensorflow::TTypes<T, NDIMS>::Tensor = Eigen::TensorMap<Eigen::Tensor<long int, 3, 1, long int>, 16, Eigen::MakePointer>]’:
rotobot.cpp:1742:53:   required from here
/home/sam/dev/tensorflow-1.6.0/tensorflow/core/framework/tensor.h:566:46: error: ‘v’ is not a member of ‘tensorflow::DataTypeToEnum<long int>’
   CheckTypeAndIsAligned(DataTypeToEnum<T>::v());
                                              ^
make: *** [Linux-64-debug/rotobot.o] Error 1

Any ideas?

The following are lines 31-33 of rotobot.cpp:

#include <tensorflow/core/platform/init_main.h>
#include <tensorflow/core/public/session.h>
#include <tensorflow/core/framework/tensor_shape.h>

The word long doesn't appear in the source code at all.

It was all working previously.

The previous build environment is a CentOS 6.9 virtual machine, but otherwise pretty much identical.

The product can be found here: https://kognat.com/shop

Edit:

I can see the question is a little opaque, so here is the command being run:

g++ -c -std=c++11 -g -L/home/sam/opt/standalone_tf/lib -ltensorflow_cc -L/home/sam/opt/standalone_tf/lib -ltensorflow_framework -I/home/sam/opt/standalone_tf/include/third_party  -I/home/sam/opt/standalone_tf/include -I/home/sam/opt/standalone_tf/include/nsync/public/ -I/home/sam/opt/eigen3/include/eigen3 -I/home/sam/opt/ilmbase-2.2.0/include -I/home/sam/opt/oiio-1.6.18/include/ -I/home/sam/dev/RLM/src -L/home/samh/dev/RLM/x64_l1 -lrlm   -I../..//../include -I../..//include -I../..//Plugins/include -m64 -fPIC -fvisibility=hidden     rotobot.cpp -o Linux-64-debug/rotobot.o

the includes of interest are: -I/home/sam/opt/standalone_tf/include/third_party -I/home/sam/opt/standalone_tf/include -I/home/sam/opt/standalone_tf/include/nsync/public/ -I/home/sam/opt/eigen3/include/eigen3

where the contents of /home/sam/opt/standalone_tf mimic the directory /home/standalone in the tutorial

https://tuanphuc.github.io/standalone-tensorflow-cpp/

The example in the tutorial above builds and runs fine.

see:

[sam@localhost Rotobot]$ cd ~/opt/standalone_tf/
[sam@localhost standalone_tf]$ make clean
rm -f main
[sam@localhost standalone_tf]$ make
g++ -std=c++11 -g -Wall -D_DEBUG -Wshadow -Wno-sign-compare -w -o main main.cc -I/usr/local/include/eigen3 -I./include/third_party -I./include -I./include/nsync/public/ -lprotobuf -pthread -lpthread -L/home/sam/opt/standalone_tf//lib/ -Wl,-R/home/sam/opt/standalone_tf//lib/ '-Wl,-R$ORIGIN' -ltensorflow_cc -ltensorflow_framework -lrt
[sam@localhost standalone_tf]$ make run
./main --image=./data/grace_hopper.jpg --graph=./data/inception_v3_2016_08_28_frozen.pb --labels=./data/imagenet_slim_labels.txt
2019-01-05 07:26:49.172451: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2
2019-01-05 07:26:49.804743: I main.cc:250] military uniform (653): 0.834307
2019-01-05 07:26:49.804781: I main.cc:250] mortarboard (668): 0.0218693
2019-01-05 07:26:49.804791: I main.cc:250] academic gown (401): 0.010358
2019-01-05 07:26:49.804799: I main.cc:250] pickelhaube (716): 0.00800808
2019-01-05 07:26:49.804807: I main.cc:250] bulletproof vest (466): 0.00535084

I seem to have lost some of the configuration for the TensorFlow header files when transferring the build system from the old box (a VM, which I still have) to the new CUDA hardware.

For now I will try building up the example in the sandbox in hope that it transfers some new knowledge over to the existing package for deployment.

Just to let you know, this is the prototype for a deployable package, and the reason I am spending time on CentOS 6.x is that I need the product to be compatible with glibc 2.12. If there is a shorter route to this destination, let me know.

I will try now without CUDA support and see if the result is any different. — Sam Hodge, Jan 04 '19 at 18:42
Same problem without CUDA, so that is not the source of the error — Sam Hodge, Jan 04 '19 at 20:43
Looking at https://tuanphuc.github.io/standalone-tensorflow-cpp/ has given me some hints and following the tutorial I can build and run the standalone in the tutorial — Sam Hodge, Jan 04 '19 at 20:43
I did something silly and went to CentOS 7.4 to see if that solved the problem, only to open my eyes to the real issue: auto detectionsMap = outputs[0].tensor<int64_t, 3>(); doesn't like it being an int64_t, so I guess I need to choose another data type — Sam Hodge, Jan 09 '19 at 02:15

score 0 · Answer 1 · answered Jan 09 '19 at 04:04

This was the cause

auto detectionsMap = outputs[0].tensor<int64_t, 3>();

whereas this worked

auto detectionsMap = outputs[0].tensor<int64, 3>();

I wrote it that way because that is what TensorBoard told me. It is funny that it worked OK in a VM but not on real HW.

Weird eh?
