
My CuPy and Chainer versions are as follows:

Chainer: 4.1.0
NumPy: 1.15.4
CuPy:
  CuPy Version          : 4.1.0
  CUDA Root             : /usr/local/cuda-9.0
  CUDA Build Version    : 9000
  CUDA Driver Version   : 9020
  CUDA Runtime Version  : 9000
  cuDNN Build Version   : 7104
  cuDNN Version         : 7104
  NCCL Build Version    : 2104

I am trying to run a test script, following this link, with the following command:

python image_sheeping.py figure_skating/models/resnet_50_augmentation_no_noise_75_100/Resnet50SheepLocalizer_97305.npz \log -i figure_skating/evaluation_dataset/test_images/22.png -g 0 -o validation_data/images/analyzed

The error message I receive is:

Traceback (most recent call last):
  File "image_sheeping.py", line 50, in <module>
    bboxes, scores = localizer.localize(processed_image)[:2]
  File "/home/rahul/Desktop/Thesis/code/loans/loans/sheep/unsupervised_sheep_localizer.py", line 43, in localize
    bboxes, rois, scores, visual_backprop = self.model.predict([processed_image], return_visual_backprop=return_visual_backprop)
  File "/home/rahul/Desktop/Thesis/code/loans/loans/figure_skating/models/resnet_50_augmentation_no_noise_75_100/localizer.py", line 102, in predict
    rois, bboxes = self(images)
  File "/home/rahul/Desktop/Thesis/code/loans/loans/figure_skating/models/resnet_50_augmentation_no_noise_75_100/localizer.py", line 144, in __call__
    h = self.feature_extractor(input_images, layers=['res5', 'pool5'])
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/chainer/links/model/vision/resnet.py", line 198, in __call__
    h = func(h)
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/chainer/links/connection/convolution_2d.py", line 175, in __call__
    groups=self.groups)
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/chainer/functions/connection/convolution_2d.py", line 582, in convolution_2d
    y, = fnode.apply(args)
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/chainer/function_node.py", line 258, in apply
    outputs = self.forward(in_data)
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/chainer/function_node.py", line 367, in forward
    return self.forward_gpu(inputs)
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/chainer/functions/connection/convolution_2d.py", line 161, in forward_gpu
    return self._forward_cudnn(x, W, b, y)
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/chainer/functions/connection/convolution_2d.py", line 234, in _forward_cudnn
    auto_tune=auto_tune, tensor_core=tensor_core)
  File "cupy/cudnn.pyx", line 598, in cupy.cudnn.convolution_forward
  File "cupy/cudnn.pyx", line 33, in cupy.cudnn.get_handle
  File "cupy/cuda/cudnn.pyx", line 473, in cupy.cuda.cudnn.create
  File "cupy/cuda/cudnn.pyx", line 446, in cupy.cuda.cudnn.check_status
cupy.cuda.cudnn.CuDNNError: CUDNN_STATUS_INTERNAL_ERROR

Exception ignored in: __del__ of 0%| | 0/1 [00:17
Traceback (most recent call last):
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/tqdm/_tqdm.py", line 931, in __del__
    self.close()
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/tqdm/_tqdm.py", line 1133, in close
    self._decr_instances(self)
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/tqdm/_tqdm.py", line 496, in _decr_instances
    cls.monitor.exit()
  File "/home/rahul/.virtualenvs/loans/lib/python3.6/site-packages/tqdm/_monitor.py", line 52, in exit
    self.join()
  File "/usr/lib/python3.6/threading.py", line 1053, in join
    raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread

Can anyone help me solve this error?

talonmies
Rahul
  • One possible recommendation for a cuDNN library internal error like this is to file a bug with NVIDIA at developer.nvidia.com with complete instructions for demonstrating the issue. – Robert Crovella Dec 11 '18 at 14:06
  • Could you try again with CuPy installed using `pip install cupy-cuda90` instead of `pip install cupy`? – kmaehashi Dec 12 '18 at 04:37
  • @KenichiMaehashi I have tried both; it shows the same error. – Rahul Dec 12 '18 at 11:30

1 Answer


Could you rerun the code with the following environment variables set?

export CUDNN_LOGDEST_DBG=cudnn_debug.log
export CUDNN_LOGINFO_DBG=1

Then please share the cudnn_debug.log file (e.g. using Gist).
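If exporting the variables in the shell is inconvenient (for example, when launching from an IDE), the same two variables can be set from a small Python wrapper that starts the script as a child process. This is only a sketch: the helper names `cudnn_logging_env` and `run_with_cudnn_logging` are made up for illustration, and it assumes the variables are set before the child process loads cuDNN.

```python
import os
import subprocess
import sys


def cudnn_logging_env(log_path="cudnn_debug.log", base_env=None):
    """Return a copy of the environment with cuDNN API logging enabled.

    Hypothetical helper: it only sets the two variables named in the answer.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDNN_LOGINFO_DBG"] = "1"       # turn on informational logging
    env["CUDNN_LOGDEST_DBG"] = log_path  # file that cuDNN writes the log to
    return env


def run_with_cudnn_logging(argv, log_path="cudnn_debug.log"):
    """Run a command with cuDNN logging enabled and return its exit code."""
    completed = subprocess.run(argv, env=cudnn_logging_env(log_path))
    return completed.returncode
```

For example, `run_with_cudnn_logging([sys.executable, "image_sheeping.py", ...])`, with the same arguments as in the question, should leave the log in `cudnn_debug.log` in the current working directory.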

kmaehashi