
I trained a model using the MXNet framework. The inference time for the model is ~9 milliseconds. The model consists mainly of conv layers and uses depthwise separable convolutions.

I want to run that model in the browser. I converted the model to ONNX format, then from

ONNX -> TensorFlow -> TensorFlow.js.

The inference time for the TensorFlow.js model is ~129 milliseconds.
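For reference, the conversion was roughly along the lines of the sketch below. This is only a minimal sketch: the file names, input shape, and output paths are placeholders, and the exact onnx-tf and tensorflowjs_converter steps may differ between versions.

    import numpy as np
    import onnx
    from mxnet.contrib import onnx as onnx_mxnet
    from onnx_tf.backend import prepare

    # 1. MXNet -> ONNX (symbol/params files from the trained model; shape is a placeholder)
    onnx_path = onnx_mxnet.export_model(
        sym="model-symbol.json",
        params="model-0000.params",
        input_shape=[(1, 3, 224, 224)],
        input_type=np.float32,
        onnx_file_path="model.onnx",
    )

    # 2. ONNX -> TensorFlow (exports a SavedModel or frozen graph depending on onnx-tf version)
    tf_rep = prepare(onnx.load(onnx_path))
    tf_rep.export_graph("tf_model")

    # 3. TensorFlow -> TensorFlow.js (run from the shell)
    #    tensorflowjs_converter --input_format=tf_saved_model tf_model tfjs_model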

Any suggestions to improve the performance of the model?

I have also tried ONNX.js, but it seems it still has a few bugs.

Soubhi M. Hadri
  • It seems that while converting the model from MXNet to ONNX -> TensorFlow, the num_group parameter for the conv layers in MXNet was not handled correctly. I ran the TensorFlow model and it took ~90 milliseconds. I will try to build and train the model in Keras and check whether there is any improvement. – Soubhi M. Hadri Mar 06 '19 at 15:02
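For context: in MXNet, setting num_group equal to the number of input channels is how a depthwise convolution is expressed, which is the part that tends to get lost in conversion. Below is a minimal sketch of one depthwise separable block in Keras, in case it helps when rebuilding the model; the filter counts and input shape are illustrative only.

    from tensorflow import keras
    from tensorflow.keras import layers

    def depthwise_separable_block(x, pointwise_filters, stride=1):
        # Depthwise 3x3 conv: one filter per input channel
        # (what MXNet expresses via num_group == number of input channels)
        x = layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
        # Pointwise 1x1 conv mixes the channels
        x = layers.Conv2D(pointwise_filters, 1, padding="same", use_bias=False)(x)
        x = layers.BatchNormalization()(x)
        return layers.ReLU()(x)

    # Illustrative usage with a placeholder input shape
    inputs = keras.Input(shape=(224, 224, 3))
    x = layers.Conv2D(32, 3, strides=2, padding="same")(inputs)
    outputs = depthwise_separable_block(x, pointwise_filters=64)
    model = keras.Model(inputs, outputs)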

1 Answer


Re-architecting would be a possibility, since you're dealing with 129 ms of latency: you would have time to send each image to an endpoint (EC2, or SageMaker + API Gateway) running a performant inference server instead of running inference in the browser.
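As a rough sketch of that setup, assuming a plain Flask server wrapped around the original MXNet model (the web framework choice, model file names, and input shape here are illustrative; SageMaker or a dedicated inference server would handle batching and scaling in practice):

    import mxnet as mx
    import numpy as np
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    # Load the original MXNet model once at startup (file names are placeholders)
    sym, arg_params, aux_params = mx.model.load_checkpoint("model", 0)
    mod = mx.mod.Module(symbol=sym, data_names=["data"], label_names=None)
    mod.bind(for_training=False, data_shapes=[("data", (1, 3, 224, 224))])
    mod.set_params(arg_params, aux_params, allow_missing=True)

    @app.route("/predict", methods=["POST"])
    def predict():
        # Expect a raw float32 tensor of shape (1, 3, 224, 224) in the request body
        data = np.frombuffer(request.data, dtype=np.float32).reshape(1, 3, 224, 224)
        batch = mx.io.DataBatch([mx.nd.array(data)])
        mod.forward(batch, is_train=False)
        probs = mod.get_outputs()[0].asnumpy()
        return jsonify(probs.tolist())

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)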

Vishaal

Vishaal
  • 735
  • 3
  • 13