
I use the following snippet of code to print the scale when using PyTorch's Automatic Mixed Precision package (amp):

import torch

scaler = torch.cuda.amp.GradScaler(init_scale=65536.0, growth_interval=1)
print(scaler.get_scale())  # prints the current loss scale

and this is the output that I get:

...
65536.0
32768.0
16384.0
8192.0
4096.0
...
1e-xxx
...
0
0
0

All the losses after this step become NaN (and the scale remains 0 in the meantime).
What's wrong with my loss function or training data?
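For context, here is a minimal sketch of the typical mixed-precision training loop from the PyTorch docs, with the scale printed every iteration; the model, optimizer, and data below are placeholders chosen for illustration, not the asker's actual setup. With growth_interval=1 the scale doubles after every step whose gradients are finite, so a scale that only ever halves means every single step is producing inf/NaN gradients.

import torch

# Minimal sketch; model, optimizer, loss, and data are hypothetical placeholders.
model = torch.nn.Linear(10, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()
scaler = torch.cuda.amp.GradScaler(init_scale=65536.0, growth_interval=1)

for step in range(100):
    inputs = torch.randn(32, 10, device="cuda")    # placeholder batch
    targets = torch.randn(32, 1, device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():                # forward pass in mixed precision
        loss = loss_fn(model(inputs), targets)
    scaler.scale(loss).backward()                  # backprop on the scaled loss
    scaler.step(optimizer)                         # skips optimizer.step() if grads contain inf/NaN
    scaler.update()                                # halves the scale on inf/NaN, else doubles it here
    print(step, scaler.get_scale())

If this loop also shows the scale halving on every iteration, the non-finite gradients come from the model or data rather than from the scaler configuration.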

  • Please be clearer and explain more. I edited your question as best as I could, but the intention was not really clear. Please edit your question and add the necessary information, such as what you have written so far and what exactly you are after. – Hossein Jul 26 '20 at 12:14
  • See: https://pytorch.org/docs/stable/notes/amp_examples.html#typical-mixed-precision-training – Lars Ericson Sep 07 '20 at 18:16

0 Answers