Training deep learning models on Azure ML is extremely slow

Question

Training a classification model on one V100 GPU (Standard_NC6s_v3) on Azure ML take couples of days which is extremely slow. Has anybody experienced that before?

I suspect that the data loading might be the bottleneck. However, I tried to copy the data from Azure blob storage into compute cluster docker container. However, it didn't change anything.

score 0 · Answer 1 · answered Nov 16 '22 at 10:55

0

When using deep learning models, we need to create the compute instance with GPU. STANDARD_NV24 is suggestable.

answered Nov 16 '22 at 10:55

Sairam Tadepalli

1,563
1
3
11

I am using Standard_NC6s_v3 which has one V100 GPU and has higher compute copability than K80. – Sina Nov 16 '22 at 14:27
Can you mention the dataset size which is being used and the splitting distribution. – Sairam Tadepalli Nov 17 '22 at 03:40
Same training takes 1/3 of the time on a local server – Sina Nov 17 '22 at 14:38

Training deep learning models on Azure ML is extremely slow

1 Answers1