I have my own dataset and want to do a classification task. I built the same symbolic network in MXNet and in Keras, with the same optimizer settings, but the results are different.
The MXNet results look essentially random.
My Keras code defines the same network:
but the result is much better. On the training set, accuracy can reach 100%.
I still cannot figure out why. The network architectures and the data are the same, yet the classification results of the two frameworks differ greatly.
Hope someone can give some suggestions. Thx.