I would like to know how to obtain the total number of CUDA Cores in my GPU using Python, Numba and cudatoolkit.
-
Does this answer your question? [How can I get number of Cores in cuda device?](https://stackoverflow.com/questions/32530604/how-can-i-get-number-of-cores-in-cuda-device) – MichaelJanz Sep 10 '20 at 06:12
-
possible duplicate at https://stackoverflow.com/questions/32530604/how-can-i-get-number-of-cores-in-cuda-device – MichaelJanz Sep 10 '20 at 06:12
-
@MichaelJanz unfortunately, I have a very specific requirement of using Python and Numba. The possible duplicate that you have suggested would solve the problem if I were using C. It is a nice source of information, but it is not an answer to my question. In short, my question is not a duplicate: it specifically asks for a Python and Numba based solution. – codeonion Sep 10 '20 at 20:44
1 Answer
Most of what you need can be found by combining the information in this answer along with the information in this answer.
We'll use the first answer to get the device compute capability and the number of streaming multiprocessors (SMs). We'll use the second answer (converted to Python) to map the compute capability to a "core" count per SM, then multiply that by the number of SMs.
Here is a full example:
$ cat t36.py
from numba import cuda
cc_cores_per_SM_dict = {
(2,0) : 32,
(2,1) : 48,
(3,0) : 192,
(3,5) : 192,
(3,7) : 192,
(5,0) : 128,
(5,2) : 128,
(6,0) : 64,
(6,1) : 128,
(7,0) : 64,
(7,5) : 64,
(8,0) : 64,
(8,6) : 128,
(8,9) : 128,
(9,0) : 128
}
# .get() on the above dictionary returns None if a cc match is not found.
# The dictionary needs to be extended as new devices become available,
# and currently does not account for all Jetson devices
device = cuda.get_current_device()
my_sms = device.MULTIPROCESSOR_COUNT
my_cc = device.compute_capability
cores_per_sm = cc_cores_per_SM_dict.get(my_cc)
print("GPU compute capability: " , my_cc)
print("GPU total number of SMs: " , my_sms)
if cores_per_sm is None:
    # avoid a TypeError from None*int when the cc is not in the dictionary
    print("total cores: unknown for this compute capability")
else:
    total_cores = cores_per_sm*my_sms
    print("total cores: " , total_cores)
$ python t36.py
GPU compute capability: (5, 2)
GPU total number of SMs: 8
total cores: 1024
$
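
If the lookup is needed in more than one place, the same idea can be wrapped in a function. The following is a minimal sketch under stated assumptions, not a definitive implementation: the names `CORES_PER_SM`, `total_cuda_cores` and `current_device_cores` are hypothetical, the table copies the values from the script above, and Numba is imported lazily so the pure lookup can be exercised on a machine without a GPU.

```python
# Hypothetical helper names; the table mirrors the dictionary in the script above.
CORES_PER_SM = {
    (2, 0): 32, (2, 1): 48,
    (3, 0): 192, (3, 5): 192, (3, 7): 192,
    (5, 0): 128, (5, 2): 128,
    (6, 0): 64, (6, 1): 128,
    (7, 0): 64, (7, 5): 64,
    (8, 0): 64, (8, 6): 128, (8, 9): 128,
    (9, 0): 128,
}


def total_cuda_cores(cc, sm_count):
    """Return cores-per-SM times SM count, or None if cc is not in the table."""
    cores_per_sm = CORES_PER_SM.get(tuple(cc))
    if cores_per_sm is None:
        return None  # unknown compute capability: the table needs extending
    return cores_per_sm * sm_count


def current_device_cores():
    """Query the current Numba CUDA device (requires Numba and a CUDA GPU)."""
    from numba import cuda  # lazy import: the lookup above works without a GPU

    device = cuda.get_current_device()
    return total_cuda_cores(device.compute_capability,
                            device.MULTIPROCESSOR_COUNT)
```

On a machine with a CUDA GPU, `print(current_device_cores())` would report the total, or `None` when the compute capability is missing from the table. For the device in the run above, `total_cuda_cores((5, 2), 8)` gives 1024.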

Robert Crovella
-
This was exactly what I was looking for. The output of this script matches the output of: $ nvidia-settings -q CUDACores -t – codeonion Sep 10 '20 at 20:47
-
`'COMPUTE_CAPABILITY'` is now split between major and minor properties: ```my_cc = (device.COMPUTE_CAPABILITY_MAJOR, device.COMPUTE_CAPABILITY_MINOR)``` – Ilya Orson Aug 25 '21 at 01:42
-
Compute capability has always consisted of major and minor parts. In Numba, they are the two elements of a tuple. Retrieval as a tuple is [supported](https://numba.readthedocs.io/en/stable/cuda-reference/host.html#numba.cuda.cudadrv.driver.Device). – Robert Crovella Aug 25 '21 at 02:00
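
The two retrieval styles mentioned in these comments can be reconciled with a small helper. This is only a sketch: `compute_capability_of` is a hypothetical name, and it assumes the device object exposes either the `compute_capability` tuple or the `COMPUTE_CAPABILITY_MAJOR` / `COMPUTE_CAPABILITY_MINOR` attributes quoted above. The stand-in objects exist only so the example runs without a GPU.

```python
from types import SimpleNamespace


def compute_capability_of(device):
    """Return (major, minor), preferring the tuple attribute when present."""
    cc = getattr(device, "compute_capability", None)
    if cc is not None:
        return tuple(cc)
    # Fall back to the separate major/minor device attributes.
    return (device.COMPUTE_CAPABILITY_MAJOR, device.COMPUTE_CAPABILITY_MINOR)


# Stand-ins for a device object, used here only for illustration.
tuple_style = SimpleNamespace(compute_capability=(5, 2))
split_style = SimpleNamespace(COMPUTE_CAPABILITY_MAJOR=8,
                              COMPUTE_CAPABILITY_MINOR=6)

print(compute_capability_of(tuple_style))
print(compute_capability_of(split_style))
```

With a real device, the argument would be the object returned by `cuda.get_current_device()`.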