I tried running my pytorch code but got this error:
A40 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
If you want to use the A40 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
Using backend: pytorch
/home/miranda9/miniconda3/envs/metalearningpy1.7.1c10.2/lib/python3.8/site-packages/torch/cuda/__init__.py:104: UserWarning:
A40 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
If you want to use the A40 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
Traceback (most recent call last):
File "/home/miranda9/ML4Coq/ml4coq-proj-src/embeddings_zoo/tree_nns/main_brando.py", line 305, in <module>
main_distributed()
File "/home/miranda9/ML4Coq/ml4coq-proj-src/embeddings_zoo/tree_nns/main_brando.py", line 201, in main_distributed
mp.spawn(fn=train, args=(opts,), nprocs=opts.world_size)
File "/home/miranda9/miniconda3/envs/metalearningpy1.7.1c10.2/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/home/miranda9/miniconda3/envs/metalearningpy1.7.1c10.2/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "/home/miranda9/miniconda3/envs/metalearningpy1.7.1c10.2/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/home/miranda9/miniconda3/envs/metalearningpy1.7.1c10.2/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/miranda9/ML4Coq/ml4coq-proj-src/embeddings_zoo/tree_nns/main_brando.py", line 210, in train
setup_process(opts, rank, master_port=opts.master_port, world_size=opts.world_size)
File "/home/miranda9/ultimate-utils/ultimate-utils-proj-src/uutils/torch/distributed.py", line 165, in setup_process
dist.init_process_group(backend, rank=rank, world_size=world_size)
File "/home/miranda9/miniconda3/envs/metalearningpy1.7.1c10.2/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 455, in init_process_group
barrier()
File "/home/miranda9/miniconda3/envs/metalearningpy1.7.1c10.2/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 1960, in barrier
work = _default_pg.barrier()
RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1607369981906/work/torch/lib/c10d/ProcessGroupNCCL.cpp:31, unhandled cuda error, NCCL version 2.7.8
but then it sends me to download it for my mac...? which is weird. What version of pytorch, cuda, cudnn, nccl and other things do I need for a GPU A40?
to see the code I ran and conda env info see this: https://github.com/pytorch/pytorch/issues/58794
related links