Note the sentence
CUDA applications built using CUDA Toolkit versions 2.1 through 10.2 are compatible with NVIDIA Ampere architecture based GPUs as long as they are built to include PTX versions
(emphasis mine)
Plus the explanation in the section above.
When a CUDA application launches a kernel on a GPU, the CUDA Runtime determines the compute capability of the GPU in the system and uses this information to find the best matching cubin or PTX version of the kernel. If a cubin compatible with that GPU is present in the binary, the cubin is used as-is for execution. Otherwise, the CUDA Runtime first generates compatible cubin by JIT-compiling the PTX and then the cubin is used for the execution. If neither compatible cubin nor PTX is available, kernel launch results in a failure.
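The selection rule quoted above can be sketched as a small model. This is only an illustration, not the runtime's actual code: the `Image` type and the function names are hypothetical, and the compatibility rules are simplified to "a cubin runs on a GPU with the same major compute capability and an equal or higher minor one" and "PTX can be JIT-compiled for any GPU of equal or higher compute capability".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ComputeCapability = Tuple[int, int]  # (major, minor), e.g. (8, 0) for an A100


@dataclass(frozen=True)
class Image:
    """Hypothetical stand-in for one kernel image embedded in a binary."""
    kind: str               # "cubin" (machine code) or "ptx" (virtual ISA)
    cc: ComputeCapability   # architecture the image was built for


def cubin_runs_on(cubin_cc: ComputeCapability, gpu_cc: ComputeCapability) -> bool:
    # Simplified rule: same major version, GPU minor version not lower.
    return cubin_cc[0] == gpu_cc[0] and cubin_cc[1] <= gpu_cc[1]


def ptx_can_jit_for(ptx_cc: ComputeCapability, gpu_cc: ComputeCapability) -> bool:
    # Simplified rule: PTX is forward-compatible with newer architectures.
    return ptx_cc <= gpu_cc


def select_image(images: List[Image],
                 gpu_cc: ComputeCapability) -> Tuple[str, Optional[ComputeCapability]]:
    """Return ("cubin", cc), ("jit", cc) or ("fail", None) for the given GPU."""
    cubins = [i for i in images if i.kind == "cubin" and cubin_runs_on(i.cc, gpu_cc)]
    if cubins:
        # A compatible cubin is used as-is; prefer the closest match.
        return ("cubin", max(cubins, key=lambda i: i.cc).cc)
    ptx = [i for i in images if i.kind == "ptx" and ptx_can_jit_for(i.cc, gpu_cc)]
    if ptx:
        # No usable cubin: JIT-compile the newest usable PTX.
        return ("jit", max(ptx, key=lambda i: i.cc).cc)
    # Neither a compatible cubin nor PTX: the kernel launch fails.
    return ("fail", None)


ampere = (8, 0)
old_with_ptx = [Image("cubin", (7, 0)), Image("ptx", (7, 0))]
old_without_ptx = [Image("cubin", (7, 0))]
print(select_image(old_with_ptx, ampere))
print(select_image(old_without_ptx, ampere))
```

In this model, a binary built for compute capability 7.0 that embeds PTX ends up on the JIT path on an Ampere GPU, while the same binary without PTX fails to launch, which is exactly the condition the quoted sentence puts on older builds.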
In effect: the CUDA toolkit remains ABI-compatible between versions 2.1 and 11, so an application built with an older version will still load at runtime. The CUDA runtime will then detect that your kernels were built for an architecture that is not compatible with Ampere, so it will take the embedded PTX and compile a new version at runtime.
As noted in the comments, only a current driver is required on the production system for this to work.