When I train on multiple GPUs with MXNet (CUDA 8.0 + cuDNN 7), I first initialize the parameters on each context and then run scatter_nd on each context. The first scatter_nd call works perfectly, but when computing on the second GPU I get:

F1217 23:53:01.012707 2619 stream_gpu-inl.h:62] Check failed: e == cudaSuccess CUDA: an illegal memory access was encountered

José Pereda
Poodar

1 Answer

This happens if you haven't copied your data over to every context: each input to an operator has to live on the same device the operator runs on. According to github.com/apache/incubator-mxnet/issues/10240, this may be fixed by github.com/apache/incubator-mxnet/pull/10833. If you're using 1.3.0, try 1.3.1.

Vishaal