The docs (see also this) for autocast in PyTorch only discuss training. Does it speed things up if I also use autocast for inference?
1 Answer
Yes, it can (though it may not in every case). Under autocast, eligible operations run in lower precision (e.g. float16 instead of float32), so your program reads and processes half as much data. That reduces memory-bandwidth pressure, improves cache locality, and lets hardware-specific fast paths kick in (e.g. Tensor Cores on CUDA GPUs).
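A minimal sketch of autocast at inference time (the `model` here is a stand-in `nn.Linear`; on CUDA you would typically use `float16`, while CPU autocast uses `bfloat16`):

```python
import torch

# Stand-in model and input; substitute your own module and data.
model = torch.nn.Linear(8, 4)
x = torch.randn(2, 8)

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.bfloat16
model = model.to(device)
x = x.to(device)

# inference_mode disables autograd bookkeeping; autocast runs eligible
# ops (like the matmul inside Linear) in the lower-precision dtype.
with torch.inference_mode(), torch.autocast(device_type=device, dtype=dtype):
    out = model(x)

print(out.dtype)  # the lower-precision dtype chosen above
```

Note that autocast only casts ops it considers safe to run in reduced precision; numerically sensitive ops (e.g. softmax reductions) stay in float32 automatically.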

Szymon Maszke
In addition for a 2080TI isn't it also using faster tensor core units which would otherwise be unused? – Lars Ericson Sep 08 '20 at 02:44