
I am a beginner in the field of Deep Learning. As my models have started taking longer to train on my CPU, I want to explore using my NVIDIA Quadro K1200 GPU for these tasks.

My specs are as follows:

  • Windows 10 22H2
  • Python 3.11 and 3.10 (in Windows' native environment), Python 3.8 (in my WSL)
  • TensorFlow 2.12
  • Intel i7-6700 CPU
  • 32GB DDR4 RAM
  • NVIDIA Quadro K1200 GPU (Graphics Driver 528.89)

The code I'm working with is in this Jupyter notebook.

Things I've tried:

  1. I've tried tensorflow-directml-plugin in Windows' native environment as per the documentation here. When running it, I get UnimplementedError: Graph execution error (cell 20 in the notebook above).
  2. I've also tried WSL as per the documentation here and here. On this route, I opted for Option 1: Installation of Linux x86 CUDA Toolkit using WSL-Ubuntu Package as per NVIDIA's documentation. I then ran the following installation commands from the TensorFlow installation page for Windows WSL2:
conda install -c conda-forge cudatoolkit=11.8.0
python3 -m pip install nvidia-cudnn-cu11==8.6.0.163 tensorflow==2.12.*
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
# Verify install:
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
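For reference, the two echo commands above should leave an activation script roughly like this (a sketch; the exact cuDNN path depends on where pip placed nvidia-cudnn-cu11 inside the env):

```shell
# $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
# Resolve the pip-installed cuDNN location and expose it to the dynamic linker.
CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib
```

Since it lives in activate.d, it only runs on `conda activate`, so the env has to be re-activated after creating it for LD_LIBRARY_PATH to take effect.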

When I execute the last (verification) command, I get the following output:

(condaenv) root@DESKTOP-CHV5AM:/home/admin/Python# python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2023-06-15 12:52:11.886160: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-06-15 12:52:12.493026: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-06-15 12:52:13.151393: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-06-15 12:52:13.174058: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-06-15 12:52:13.174904: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:982] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
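Worth noting: despite the NUMA warnings, the last line shows GPU:0 was still registered. A minimal sketch (run inside the same conda env) to check whether the device is actually usable for computation, not just detected:

```python
import tensorflow as tf

# The NUMA messages above are informational; what matters is whether
# any GPU was registered as a physical device.
gpus = tf.config.list_physical_devices('GPU')
print("GPUs found:", gpus)

if gpus:
    # Place a small op explicitly on the GPU to confirm kernels launch.
    with tf.device('/GPU:0'):
        x = tf.random.normal((1024, 1024))
        y = tf.matmul(x, x)
    print("matmul executed on:", y.device)
```

If the matmul raises instead of printing a device string, the failure is in the CUDA/cuDNN runtime rather than in device detection.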

Help please.

  • Can you please suggest a stack where I may move this question? If this qualifies as a "hardware/software configuration issue", would Super User be an appropriate stack? – SOuser Jun 22 '23 at 12:43
  • I have the same problem with a different configuration and we are not the only ones (https://stackoverflow.com/questions/76470139/tensorflow-crashes-during-training-what-should-i-do). Anyone? – mixophyes Aug 02 '23 at 23:48
