I am running run_t5_mlm_flax.py on 8 GPUs and I get the error below (the script works with a single GPU):

NCCL operation ncclAllReduce(send_buffer, recv_buffer, element_count, dtype, reduce_op, comm, gpu_stream) failed: unhandled cuda error

Do you have any suggestions?

Antoine23

0 Answers