
TensorFlow v2.0.0 segfaults when trying to load a model using the C API.

The Keras model is absurdly simple:

import tensorflow as tf

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(512, activation='relu', input_shape=(28*28,)))
model.add(tf.keras.layers.Dense(10,  activation='softmax'))

model.summary()

# ---- Compile the network
model.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

The Keras model was saved using the TensorFlow v2.0.0 backend. (I checked this.)

model_fname = '/tmp/test1.h5'
print('Saving model to {}'.format(model_fname))
tf.saved_model.save(model, model_fname)
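
Since `tf.saved_model.save` writes a SavedModel directory rather than a single HDF5 file, `/tmp/test1.h5` is a directory here despite its extension. Before handing the path to the C API, it may be worth confirming that the export looks the way the loader expects. The following is a minimal sketch using only the standard library; the helper name `looks_like_saved_model` is hypothetical, and it only checks for the serialized graph file, not whether the model is actually loadable.

```python
import os


def looks_like_saved_model(export_dir):
    """Return True if export_dir contains a serialized SavedModel graph.

    Hypothetical helper: it checks the on-disk layout only.
    """
    # A SavedModel is a directory, whatever its name ends with.
    if not os.path.isdir(export_dir):
        return False
    # The graph is stored as saved_model.pb (binary) or saved_model.pbtxt (text).
    return any(
        os.path.isfile(os.path.join(export_dir, name))
        for name in ("saved_model.pb", "saved_model.pbtxt")
    )
```

Calling `looks_like_saved_model(model_fname)` right after the save would rule out a wrong or empty export directory as the cause of a load failure.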

And then it was loaded with the C API:

   const auto export_dir = format("{}", filename);
   vector<const char*> tags;
   tags.resize(1);
   tags[0] = "serve";

   TF_Graph* graph     = TF_NewGraph();
   TF_Session* session = TF_LoadSessionFromSavedModel(opts,
                                                      nullptr,
                                                      export_dir.c_str(),
                                                      &tags[0],
                                                      tags.size(),
                                                      graph,
                                                      nullptr,
                                                      status);

This results in a segfault. Below is the stack trace from ASAN:

2020-01-09 20:21:48.066441: I tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /tmp/test1.h5
2020-01-09 20:21:48.069571: I tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
AddressSanitizer:DEADLYSIGNAL
=================================================================
==36979==ERROR: AddressSanitizer: SEGV on unknown address 0x000002000904 (pc 0x7f33b0591e7d bp 0x7ffee61acb20 sp 0x7ffee61ac960 T0)
==36979==The signal is caused by a READ memory access.
    #0 0x7f33b0591e7c in tensorflow::GPUCompatibleCPUDeviceFactory::CreateDevices(tensorflow::SessionOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::unique_ptr<tensorflow::Device, std::default_delete<tensorflow::Device> >, std::allocator<std::unique_ptr<tensorflow::Device, std::default_delete<tensorflow::Device> > > >*) (/opt/tensorflow/v2.0.0/lib/libtensorflow_framework.so.2+0xfb1e7c)
    #1 0x7f33b05d4f2a in tensorflow::DeviceFactory::AddDevices(tensorflow::SessionOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::unique_ptr<tensorflow::Device, std::default_delete<tensorflow::Device> >, std::allocator<std::unique_ptr<tensorflow::Device, std::default_delete<tensorflow::Device> > > >*) (/opt/tensorflow/v2.0.0/lib/libtensorflow_framework.so.2+0xff4f2a)
    #2 0x7f33b9529172 in tensorflow::DirectSessionFactory::NewSession(tensorflow::SessionOptions const&, tensorflow::Session**) (/opt/tensorflow/v2.0.0/lib/libtensorflow.so.2+0x818e172)
    #3 0x7f33b065159f in tensorflow::NewSession(tensorflow::SessionOptions const&, tensorflow::Session**) (/opt/tensorflow/v2.0.0/lib/libtensorflow_framework.so.2+0x107159f)
    #4 0x7f33afce3df8 in tensorflow::LoadSavedModel(tensorflow::SessionOptions const&, tensorflow::RunOptions const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, tensorflow::SavedModelBundle*) (/opt/tensorflow/v2.0.0/lib/libtensorflow_framework.so.2+0x703df8)
    #5 0x7f33b1d22eb6 in TF_LoadSessionFromSavedModel (/opt/tensorflow/v2.0.0/lib/libtensorflow.so.2+0x987eb6)
    #6 0x6f4812 in zero::test::load_model_test(std::basic_string_view<char, std::char_traits<char> >) src/zero/tf-tests/load-model_test.cpp:216
    #7 0x713ba0 in main src/zero/main.cpp:35
    #8 0x7f33aedd9b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #9 0x417989 in _start (/home/amichaux/Documents/CrashPlan/Development/shinny/zero/build/gcc-10-asan/zero+0x417989)

I'm not sure what to do with this error message. The model loads and executes fine in Python 3.

Figured out the problem: I had to load TensorFlow _before_ any of my C++ code. (This required wrapping all my code in a shared library and importing it from within a Python process. Embedding a Python interpreter didn't work.) TensorFlow had to be loaded first even when my shared library was built with -fvisibility=hidden! – Zendel Jan 13 '20 at 19:23
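
The load order described in this comment could be sketched from the Python side roughly as follows. This is only an illustration using `importlib` and `ctypes` from the standard library; the helper `load_in_order` is hypothetical, and the comment does not say how the shared library was actually imported.

```python
import ctypes
import importlib


def load_in_order(first_module, library_path):
    """Import a Python module, then load a shared library with ctypes.

    Hypothetical helper: it only enforces the order of the two loads.
    """
    # Importing the module first pulls in its native libraries
    # before the C++ shared library is mapped into the process.
    module = importlib.import_module(first_module)
    # Only now load the shared library that wraps the C++ code.
    library = ctypes.CDLL(library_path)
    return module, library
```

For the setup in the question, a call might look like `load_in_order("tensorflow", "./libzero.so")`, where `./libzero.so` is a placeholder name for the shared library wrapping the C++ code.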
