
I'm working on a hand tracking script that utilizes MediaPipe for hand landmark detection and gesture recognition. I want to optimize the script to run on my GPU for faster performance. However, I'm encountering a couple of issues.

Checking GPU Allocation: When I import cv2.cuda, create a cv2.cuda_GpuMat(), and check isContinuous() on it, the call returns False, which suggests the image is not allocated on the GPU. I have verified that my GPU supports CUDA and have installed the necessary CUDA drivers.
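
For reference, this is roughly the snippet I used for that check:

import cv2

# create a GPU matrix and check whether its memory is laid out contiguously
gpu_img = cv2.cuda_GpuMat()
print(gpu_img.isContinuous())  # prints False on my machine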

Here's the output of running nvidia-smi in my terminal:

Sun Jul 16 09:46:38 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 536.40                 Driver Version: 536.40       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2060      WDDM  | 00000000:2B:00.0  On |                  N/A |
|  0%   49C    P8              10W / 170W |    785MiB /  6144MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

TensorFlow Lite XNNPACK Delegate: Additionally, I'm seeing the message "INFO: Created TensorFlow Lite XNNPACK delegate for CPU." in my console output. I want to ensure that the script is utilizing my GPU for acceleration instead of falling back to CPU execution.

Here is the folder structure of my project:

- README.md
- main.py
- models/
  - gesture_recognizer.task
  - hand_landmark_full.tflite
  - hand_landmarker.task
  - palm_detection_full.tflite

main.py script (Password:NDRmVaqqAd)

The main.py script contains the hand tracking implementation. I have already set the use_gpu flag to True in the mp_hands.Hands initialization.
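
For context, aside from that use_gpu flag, the initialization in main.py follows the standard MediaPipe Hands pattern (the confidence values below are placeholders, not necessarily the exact ones in main.py):

import mediapipe as mp

mp_hands = mp.solutions.hands

# Standard Hands setup for a live video feed; parameter values are illustrative.
hands = mp_hands.Hands(
    static_image_mode=False,
    max_num_hands=2,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5,
)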

I also ran cv2.cuda.getCudaEnabledDeviceCount(), which returns 0, indicating that CUDA is not available to OpenCV. Here's the code and the output:

import cv2

if cv2.cuda.getCudaEnabledDeviceCount() > 0:
    print("CUDA is available!")
else:
    print("CUDA is not available.")

Output:

CUDA is not available.
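
If it helps with diagnosis, I can also post the output of cv2.getBuildInformation(); as far as I understand, this shows whether the installed OpenCV wheel was built with CUDA at all:

import cv2

# Print only the CUDA-related lines of the OpenCV build configuration.
build_info = cv2.getBuildInformation()
for line in build_info.splitlines():
    if "CUDA" in line:
        print(line)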

I'm working inside a virtual environment, here's the list of packages and versions installed:

absl-py==1.4.0
asttokens==2.2.1
attrs==23.1.0
backcall==0.2.0
cffi==1.15.1
colorama==0.4.6
comm==0.1.3
contourpy==1.1.0
cycler==0.11.0
debugpy==1.6.7
decorator==5.1.1
EasyProcess==1.1
entrypoint2==1.1
executing==1.2.0
flatbuffers==23.5.26
fonttools==4.41.0
ipykernel==6.23.1
ipython==8.14.0
jedi==0.18.2
jupyter_client==8.2.0
jupyter_core==5.3.0
kiwisolver==1.4.4
matplotlib==3.7.2
matplotlib-inline==0.1.6
mediapipe==0.10.2
MouseInfo==0.1.3
mss==9.0.1
nest-asyncio==1.5.6
numpy==1.25.1
opencv-contrib-python==4.8.0.74
opencv-python==4.8.0.74
packaging==23.1
parso==0.8.3
pickleshare==0.7.5
Pillow==10.0.0
platformdirs==3.5.1
prompt-toolkit==3.0.38
protobuf==3.20.3
psutil==5.9.5
pure-eval==0.2.2
PyAutoGUI==0.9.54
pycparser==2.21
PyGetWindow==0.0.9
Pygments==2.15.1
PyMsgBox==1.0.9
pyparsing==3.0.9
pyperclip==1.8.2
PyRect==0.2.0
pyscreenshot==3.1
PyScreeze==0.1.29
python-dateutil==2.8.2
pytweening==1.0.7
pywin32==306
pyzmq==25.1.0
six==1.16.0
sounddevice==0.4.6
stack-data==0.6.2
tornado==6.3.2
traitlets==5.9.0
wcwidth==0.2.6

I would appreciate any guidance or suggestions on how to correctly configure and run the script on my GPU for improved performance. What steps can I take to ensure the GPU acceleration is properly utilized? Is there anything specific I need to do with TensorFlow Lite to enable GPU acceleration?

Thank you for your help!

Comments:
  • You have to build opencv from source with cuda support. Also you need to build MediaPipe with TensorFlow GPU with specific flags. – Hihikomori Jul 16 '23 at 11:19
  • here are wheels of opencv with cuda support: https://github.com/cudawarped/opencv-python-cuda-wheels – Christoph Rackwitz Jul 16 '23 at 13:19
  • looks like a pure mediapipe issue. why do you deal with opencv? – Christoph Rackwitz Jul 17 '23 at 12:03
  • @Hihikomori I tried countless times to build OpenCV from source with CUDA support; however, when using CMake, the logs show "Python (for build): C:/Users/*USER*/anaconda3/python.exe" instead of the python3 paths that I gave CMake. – DuNeemo Jul 17 '23 at 18:04
  • I agree you don't need CUDA OpenCV for MediaPipe. You need to follow MediaPipe's manual on how to build it (the mediapipe lib) with GPU support. – Hihikomori Jul 26 '23 at 15:44
