How can I accelerate inference in TensorFlow with a sparse weight matrix obtained from pruning?

Question

I used TensorFlow pruning to obtain a sparse weight matrix and shrink SqueezeNet. After running strip_pruning_vars, I confirmed that most elements of the weight matrix had been pruned to 0. However, the model's inference speed did not improve as I expected. It seems that an additional software library or hardware that supports sparse matrix operations is required. Someone told me that the Intel MKL library would help, but I don't know how to integrate it with TensorFlow. I now have the .pb files of the pruned SqueezeNet. Any help would be highly appreciated.

score 1 · Answer 1 · answered May 16 '19 at 14:01

You can try the Intel® Optimization for TensorFlow* wheel.

It is recommended to install it in an Intel conda environment.

Follow the steps below.

Create a conda environment:

conda create -n my_intel_env -c intel python=3.6

Activate the environment:

source activate my_intel_env

Install the wheel:

pip install https://storage.googleapis.com/intel-optimized-tensorflow/tensorflow-1.11.0-cp36-cp36m-linux_x86_64.whl

For more details, refer to https://software.intel.com/en-us/articles/intel-optimization-for-tensorflow-installation-guide

After installation, you can check whether MKL is enabled by running the following commands from the Python prompt.

from tensorflow.python.framework import test_util
test_util.IsMklEnabled()

This should return True if MKL is enabled.

Hope this helps.

score 0 · Answer 2 · answered Sep 25 '19 at 03:01

I ran into the same problem. I used TensorFlow to prune a model, but the pruned model did not actually predict any faster. The TensorFlow roadmap (https://www.tensorflow.org/model_optimization/guide/roadmap) says that support for sparse model execution will be added in the future. So I guess the reason is that TensorFlow does not support it yet: for now we can only get a sparse model, with no speed improvement.
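
To illustrate why zeroed weights alone may not make anything faster, here is a minimal pure-Python sketch. It is not TensorFlow's kernel code, and the helper names (dense_matvec, to_csr, csr_matvec) are hypothetical, made up for this example. It counts the multiply-adds done by a dense matrix-vector product and by a CSR-style sparse one on the same pruned matrix. The dense loop visits every entry whether or not it is zero, which is roughly the situation when a pruned model is still executed with dense ops.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# TensorFlow's kernels. They count multiply-adds to show the work done.

def dense_matvec(matrix, vec):
    """Dense product: visits every entry, including the zeros."""
    ops = 0
    out = []
    for row in matrix:
        acc = 0.0
        for w, x in zip(row, vec):
            acc += w * x
            ops += 1
        out.append(acc)
    return out, ops


def to_csr(matrix):
    """Convert a dense list of rows to CSR arrays: values, column indices, row pointers."""
    values, cols, row_ptr = [], [], [0]
    for row in matrix:
        for j, w in enumerate(row):
            if w != 0:
                values.append(w)
                cols.append(j)
        row_ptr.append(len(values))
    return values, cols, row_ptr


def csr_matvec(csr, vec):
    """Sparse product: visits only the stored nonzero entries."""
    values, cols, row_ptr = csr
    ops = 0
    out = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * vec[cols[k]]
            ops += 1
        out.append(acc)
    return out, ops


# A 3x4 "pruned" weight matrix: 9 of its 12 entries are zero.
weights = [
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.5, 0.0, 0.0, -1.0],
]
x = [1.0, 2.0, 3.0, 4.0]

dense_out, dense_ops = dense_matvec(weights, x)
sparse_out, sparse_ops = csr_matvec(to_csr(weights), x)

# Same result, but the dense version does 12 multiply-adds and the sparse one 3.
print(dense_out == sparse_out, dense_ops, sparse_ops)
```

Both versions give the same output, but only the sparse one skips the zeros. Whether a real sparse kernel is actually faster also depends on the sparsity level and the hardware, so treat this as a picture of the idea rather than a benchmark.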
