How to convert a scikit-learn model into a fast `.so`

Question

What would be the best way to convert a scikit-learn model (e.g. the result of a `RandomForestClassifier` fit) into a piece of C++, so as to get the fastest possible `.so` that can be called from some other ecosystem?

I don't understand what you mean by "convert". `RandomForestClassifier` is [implemented in Python](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/ensemble/forest.py#L372-L628), not as a compiled extension. If you want a fast compiled version you will need to re-write it. One option would be to use [Cython](http://cython.org/) which can convert a superset of Python to C so that it can be statically compiled. There's no zero-effort solution, though - you will need to add your own static type declarations etc. in order to see any appreciable speed-up. — ali_m, Jul 19 '16 at 00:22
Yes, I agree, but creating/training/fitting the model and using it to make a prediction are two different things. scikit-learn's `RandomForestClassifier` creates the model, which is usually stored in a pickle as a collection of trees. In a high-speed RTB context I need to "use/accelerate" this resulting model, converting only the final "tree soup" into some C++ code so it can be applied faster. I've seen PMML (http://stackoverflow.com/questions/38431113/convert-a-pmml-describe-model-in-c-c) but it does not seem to help much for our use case. — user3313834, Jul 19 '16 at 10:45

score 2 · Accepted Answer · answered Dec 17 '16 at 17:49

To port trained scikit-learn models to other languages, see the sklearn-porter project.

Whether this will be faster than the original `RandomForestClassifier.predict` method (which is multithreaded and uses numpy operations, potentially with a fast BLAS library) remains to be seen, though.
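
To give a rough idea of what a ported predictor could look like, here is a minimal hand-written sketch; it is not the output of sklearn-porter. It assumes a binary classifier and that each tree has been dumped from Python into flat arrays. The field names `children_left`, `children_right`, `feature` and `threshold` mirror the arrays on a fitted tree's `tree_` attribute, where `-1` marks a leaf's children and a sample goes left when `x[feature] <= threshold`. Everything else (`FlatTree`, `leaf_proba`, `make_demo_forest`, `demo_predict_proba`) is a hypothetical name made up for the illustration, and the two tiny trees are invented stand-ins for generated data.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flat layout for one tree. The first four fields mirror the
// arrays scikit-learn exposes on `estimator.tree_`; `leaf_proba` is an
// assumption: P(class 1) precomputed per node from `tree_.value`.
struct FlatTree {
    std::vector<int> children_left;   // -1 marks a leaf
    std::vector<int> children_right;  // -1 marks a leaf
    std::vector<int> feature;         // feature index tested at the node (-2 on leaves)
    std::vector<double> threshold;    // go left when x[feature] <= threshold
    std::vector<double> leaf_proba;   // only meaningful on leaves
};

// Walk one tree from the root (node 0) down to a leaf.
double tree_predict_proba(const FlatTree& t, const float* x) {
    int node = 0;
    while (t.children_left[node] != -1) {
        node = (x[t.feature[node]] <= t.threshold[node])
                   ? t.children_left[node]
                   : t.children_right[node];
    }
    return t.leaf_proba[node];
}

// Average the per-tree probabilities over the whole forest.
double forest_predict_proba(const std::vector<FlatTree>& forest, const float* x) {
    if (forest.empty()) return 0.0;
    double sum = 0.0;
    for (const FlatTree& t : forest) sum += tree_predict_proba(t, x);
    return sum / static_cast<double>(forest.size());
}

// Two invented depth-1 trees standing in for data exported from Python.
std::vector<FlatTree> make_demo_forest() {
    FlatTree a{{1, -1, -1}, {2, -1, -1}, {0, -2, -2},
               {0.5, -2.0, -2.0}, {0.0, 0.0, 1.0}};
    FlatTree b{{1, -1, -1}, {2, -1, -1}, {1, -2, -2},
               {2.0, -2.0, -2.0}, {0.0, 0.25, 0.75}};
    return {a, b};
}

// C entry point a shared library could export; `x` must hold 2 features here.
extern "C" double demo_predict_proba(const float* x) {
    static const std::vector<FlatTree> forest = make_demo_forest();
    return forest_predict_proba(forest, x);
}
```

Built with something like `g++ -O2 -shared -fPIC`, this gives a `.so` whose `demo_predict_proba` symbol can be called through any C FFI. Whether it actually beats the Python path would still need to be measured on the real model, particularly for batched predictions.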

