I am trying to use the TensorFlow C API to run a session with the Deeplab graph. The frozen Deeplab graph, pre-trained on Cityscapes, was downloaded from here: http://download.tensorflow.org/models/deeplabv3_mnv2_cityscapes_train_2018_02_05.tar.gz
When I run the graph in Python on a street scene image, I get the expected segmentation output.
By printing all of the graph's tensors with the Python line tensors = [n.values() for n in tf.get_default_graph().get_operations()], I found that the input tensor has dimensions {1,?,?,3} and the output tensor has dimensions {1,?,?}, and that their data types are uint8 and int64, respectively. I used this information to write a C++ method that runs the graph session:
int Deeplab::run_segmentation(image_t* img, segmap_t* seg) {
    using namespace std;
    // Allocate the input tensor
    TF_Tensor* const input = TF_NewTensor(TF_UINT8, img->dims, 4, img->data_ptr, img->bytes, &free_tensor, NULL);
    TF_Operation* oper_in = TF_GraphOperationByName(graph, "ImageTensor");
    const TF_Output oper_in_ = {oper_in, 0};
    // Allocate the output tensor
    TF_Tensor* output = TF_NewTensor(TF_INT64, seg->dims, 3, seg->data_ptr, seg->bytes, &free_tensor, NULL);
    TF_Operation* oper_out = TF_GraphOperationByName(graph, "SemanticPredictions");
    const TF_Output oper_out_ = {oper_out, 0};
    // Run the session on the input tensor
    TF_SessionRun(session, nullptr, &oper_in_, &input, 1, &oper_out_, &output, 1, nullptr, 0, nullptr, status);
    return TF_GetCode(status); // https://github.com/tensorflow/tensorflow/blob/master/tensorflow/c/tf_status.h#L42
}
Here the argument types image_t and segmap_t contain the parameters needed to call TF_NewTensor. They simply hold a pointer to the allocated buffer for the input/output tensor, the dimensions of the tensor, and its size in bytes:
typedef struct segmap {
    const int64_t* dims;
    size_t bytes;
    int64_t* data_ptr;
} segmap_t;
typedef struct image {
    const int64_t* dims;
    size_t bytes;
    uint8_t* data_ptr;
} image_t;
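The free_tensor callback passed to TF_NewTensor is not shown here. As a minimal sketch, assuming the C API's deallocator shape of (void* data, size_t len, void* arg), it could look like one of the two functions below. The names free_owned_buffer and keep_borrowed_buffer are hypothetical, and the counter passed through arg is only there to make the call observable. The distinction may matter because a buffer that is still owned by something else (an OpenCV image, for instance) should presumably not be freed by the tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical deallocators with the callback shape TF_NewTensor takes:
//   void (*)(void* data, size_t len, void* arg)
// This sketch does not include or call TensorFlow itself.

// For buffers the tensor owns (allocated with malloc or calloc).
// If `arg` is non-null it is treated as an int counter and incremented.
void free_owned_buffer(void* data, std::size_t /*len*/, void* arg) {
    assert(data != nullptr);
    std::free(data);
    if (arg != nullptr) {
        ++*static_cast<int*>(arg);
    }
}

// For buffers owned elsewhere: leave the memory alone so that it is
// not released twice. The counter is still incremented if provided.
void keep_borrowed_buffer(void* /*data*/, std::size_t /*len*/, void* arg) {
    if (arg != nullptr) {
        ++*static_cast<int*>(arg);
    }
}
```

Under this assumption, a buffer obtained from calloc would pair with free_owned_buffer, while a pointer borrowed from an image object would pair with keep_borrowed_buffer.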
Then I used OpenCV to fill an array with the street scene image (the same one used in the Python run) and passed the image_t and segmap_t structs to the session run method:
// Allocate input image object
const int64_t dims_in[4] = {1, new_size.width, new_size.height, 3};
image_t* img_in = (image_t*)malloc(sizeof(image_t));
img_in->dims = &dims_in[0];
//img_in->data_ptr = (uint8_t*)malloc(new_size.width*new_size.height*3);
img_in->data_ptr = resized_image.data;
img_in->bytes = new_size.width*new_size.height*3*sizeof(uint8_t);
// Allocate output segmentation map object
const int64_t dims_out[3] = {1, new_size.width, new_size.height};
segmap_t* seg_out = (segmap_t*)malloc(sizeof(segmap_t));
seg_out->dims = &dims_out[0];
seg_out->data_ptr = (int64_t*)calloc(new_size.width*new_size.height, sizeof(int64_t));
seg_out->bytes = new_size.width*new_size.height*sizeof(int64_t);
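One detail that may be worth double-checking in the setup above is the dimension order. Image tensors in TensorFlow are normally laid out as [batch, height, width, channels], and an OpenCV image stores its pixels row by row, so height (rows) would come before width (columns). The sketch below uses a hypothetical helper, make_image_spec, to compute the shape and byte count under that assumption; it does not depend on TensorFlow or OpenCV.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: shape and byte count of a uint8 image tensor
// laid out as [1, height, width, channels].
struct ImageTensorSpec {
    std::array<std::int64_t, 4> dims;
    std::size_t bytes;
};

// rows = image height, cols = image width.
ImageTensorSpec make_image_spec(int rows, int cols, int channels) {
    assert(rows > 0 && cols > 0 && channels > 0);
    ImageTensorSpec spec{};
    spec.dims[0] = 1;          // batch
    spec.dims[1] = rows;       // height comes first
    spec.dims[2] = cols;       // then width
    spec.dims[3] = channels;
    spec.bytes = static_cast<std::size_t>(rows) *
                 static_cast<std::size_t>(cols) *
                 static_cast<std::size_t>(channels) * sizeof(std::uint8_t);
    return spec;
}
```

With new_size from the code above, this would be called as make_image_spec(new_size.height, new_size.width, 3). The byte count is the same either way, so a swapped order would not crash but could transpose the shape the graph sees.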
However, after the session run, the resulting tensor (seg_out->data_ptr) consisted of all 0s. The graph seemed to execute for about 5 seconds, the same amount of time as the working Python implementation. Somehow, the graph is failing to write the output tensor data into the buffer I allocated. What am I doing wrong?
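A possible explanation, offered as an assumption based on the comments in the C API header: TF_SessionRun appears to place tensors that it allocates itself into the output array, rather than writing into a tensor the caller created beforehand. If so, the pointer stored in output is replaced during the call, and the calloc'ed buffer is never touched. The toy model below reproduces that behavior without TensorFlow; ToyTensor and toy_run are hypothetical stand-ins, not part of any real API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a tensor. This is NOT the TensorFlow type.
struct ToyTensor {
    std::vector<std::int64_t> data;
};

// Toy stand-in for a run call with an out-parameter: it ignores whatever
// *output pointed to before the call and stores a newly allocated result
// there. The caller becomes responsible for deleting that result.
void toy_run(ToyTensor** output) {
    assert(output != nullptr);
    *output = new ToyTensor{{7, 7, 7, 7}};
}
```

In this model, a caller that passes the address of a pointer to a zero-filled ToyTensor ends up with the pointer aimed at a new object, while the zero-filled one is unchanged. If the real call behaves this way, the result would be read from the returned tensor with TF_TensorData(output) and released with TF_DeleteTensor(output), and the output tensor would not need to be allocated in advance.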