How do I programmatically find the maximum number of concurrent CUDA threads, or the number of streaming multiprocessors, on a device (an NVIDIA graphics card)? I know about warpSize, but there is no warpCount.
Most answers on the internet are about looking these values up in PDFs rather than querying them at runtime.
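
For reference, here is a minimal sketch of the kind of runtime query I mean, assuming the CUDA runtime API's cudaGetDeviceProperties is the right entry point. The fields multiProcessorCount, maxThreadsPerMultiProcessor and warpSize come from the cudaDeviceProp struct. The helper maxResidentThreads is a hypothetical name of my own, and treating "SM count times max threads per SM" as the concurrency limit is an assumption: it is an upper bound on resident threads, and a real kernel may reach less because of register or shared memory usage.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA API): upper bound on the number
// of threads that can be resident on the whole device at the same time.
static long long maxResidentThreads(const cudaDeviceProp& prop) {
    return static_cast<long long>(prop.multiProcessorCount) *
           prop.maxThreadsPerMultiProcessor;
}

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                         cudaGetErrorString(err));
            return 1;
        }

        std::printf("Device %d: %s\n", dev, prop.name);
        std::printf("  streaming multiprocessors: %d\n", prop.multiProcessorCount);
        std::printf("  max threads per SM:        %d\n", prop.maxThreadsPerMultiProcessor);
        std::printf("  warp size:                 %d\n", prop.warpSize);
        // There is no warpCount field; derive it from the two values above.
        std::printf("  max warps per SM:          %d\n",
                    prop.maxThreadsPerMultiProcessor / prop.warpSize);
        std::printf("  max resident threads:      %lld\n", maxResidentThreads(prop));
    }
    return 0;
}
```

Built with something like nvcc query.cu -o query, this should print one block of values per visible device, with no lookup in a PDF needed.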