I'm looking to do some distributed computation on GPUs for machine learning. Does anybody have experience with MXNet (performance vs. Theano)?
Reference http://www.cs.cmu.edu/~muli/file/mxnet-learning-sys.pdf
Thanks
I have a lot of experience with both MXNet and Theano (via Lasagne and Keras).
Benchmarking is always biased, so I will not comment on that, except to note that all the frameworks are very fast. Here are several things that should help you decide:
- Theano compared to MXNet is like assembly compared to Python. Theano provides low-level primitives for building machine-learning models and does not itself define any layers or optimizers, so you would usually use it through a deep-learning library such as Lasagne or Keras, while MXNet is higher level. A fair comparison would therefore be MXNet vs. Keras, not MXNet vs. Theano.
- MXNet is a more recent library, so some things in it are not as polished yet, and there are far fewer resources online than for Theano.
- Theano (and therefore Lasagne and Keras) compiles models into C++ and CUDA the first time they run, which is very slow. For a very complex model, such as an unrolled LSTM, compilation can take a good couple of minutes. That is usually negligible compared to the time the model takes to train (hours to weeks), but it is very annoying when you prototype.

Overall, if you are choosing between these two frameworks, I would suggest Theano + Keras for everything except recurrent or very deep networks; for those, the compilation time in Theano will kill you.
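To make the "assembly vs. Python" analogy concrete, here is a toy sketch in plain Python (not actual Theano or Keras code; the names `forward_lowlevel` and `Dense` are made up for illustration). The low-level style spells out every operation of a logistic-regression forward pass by hand, the way you would wire up Theano primitives; the high-level style wraps the same arithmetic in a reusable layer object, the way Keras or MXNet would:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Low-level style: write the forward pass yourself from primitive ops,
# roughly how you would compose a model out of Theano expressions.
def forward_lowlevel(x, w, b):
    z = sum(xi * wi for xi, wi in zip(x, w)) + b  # dot product + bias
    return sigmoid(z)

# High-level style: a ready-made "layer" object hides the same arithmetic,
# roughly how a Keras/MXNet layer hides the underlying ops.
class Dense:
    def __init__(self, w, b, activation=sigmoid):
        self.w, self.b, self.activation = w, b, activation

    def __call__(self, x):
        z = sum(xi * wi for xi, wi in zip(x, self.w)) + self.b
        return self.activation(z)

x, w, b = [1.0, 2.0], [0.5, -0.25], 0.1
layer = Dense(w, b)
# Both styles compute exactly the same thing; they differ in abstraction level.
assert abs(forward_lowlevel(x, w, b) - layer(x)) < 1e-12
```

The high-level version is what you compose models from day to day; the low-level version is what you drop down to when you need an operation the framework does not ship with.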
Also look into TensorFlow. It is (subjectively) slower than MXNet, but it is more mature and has more resources online.