
I'm running cv::cuda::StereoBM, and it works fine on a Tesla K80 (compute capability 3.7). Precisely the same code, with precisely the same system libraries, hangs on a GeForce RTX 2080 Ti (compute capability 7.5). Other CUDA code works fine on that system.

In particular, this is the code that hangs:

cv::Ptr<cv::StereoBM> sbm_ptr = cv::cuda::createStereoBM();
sbm_ptr->compute(gpu_left, gpu_right, gpu_result);
// .. this line of code is never reached.

I've double-checked with OpenCV's cuda::DeviceInfo::isCompatible, and it reports my device as compatible.

I'm wondering how I might go about debugging this.

talonmies
Zendel
  • You should add what versions of OpenCV and CUDA for such questions, that way it can help people debug too and keeps the question relevant over time. – Mat Mar 07 '19 at 21:01
  • The [source code](https://github.com/opencv/opencv/blob/master/modules/calib3d/src/stereobm.cpp) can help you pinpoint the exact line that hangs. You can try simply copy/pasting the source code and change the variable names appropriately. – Mat Mar 07 '19 at 21:26
  • So, OpenCV 4.0.0 and 4.0.1 fail to build; it has something to do with NVCC's configuration. The problem occurs on OpenCV 3.4.4 and 3.4.5 (that's all I've tried). I have to pass std=c++03 to NVCC to get the CUDA parts to build. I'm using CUDA 10.0 on both the Tesla K80 machine and the GeForce RTX 2080 Ti machine. My code really is as simple as the snippet above. I suppose I could delve into the OpenCV code to figure out which line it's hanging on. – Zendel Mar 08 '19 at 18:44
  • The code is as follows (I don't know how to get Stack Overflow to format this correctly): `cv::cuda::GpuMat gpu_left{left}; cv::cuda::GpuMat gpu_right{right}; cv::cuda::GpuMat gpu_result; sbm_ptr->compute(gpu_left, gpu_right, gpu_result); // ... this line of code is never reached gpu_result.download(result);` – Zendel Mar 08 '19 at 18:48
  • Well if this is a known bug, I personally do not know it, which is why I'm not answering the question. Like I said, try to replace the `sbm_ptr->compute` call with the full source code of that method (that I linked). That will hopefully help you pinpoint the exact OpenCV call that's making it hang. – Mat Mar 08 '19 at 19:03
  • So the linked code is for the vanilla StereoBM. That code works. It fails when I run CUDA's StereoBM. (Note `cv::cuda::createStereoBM()`.) So I downloaded OpenCV-3.4.4 and put debug lines into `modules/cudastereo/src/stereobm.cpp`; however, none of the printf statements show up. I suppose I need to figure out the true path this code is taking (???) But is that the information you want? Where it fails in `modules/cudastereo/src/stereobm.cpp::compute(...)`? – Zendel Mar 08 '19 at 19:55
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/189700/discussion-between-mat-and-zendel). – Mat Mar 08 '19 at 19:56
  • So I instrumented the file `/modules/cudastereo/src/stereobm.cpp`, and it freezes executing `stereoBM_CUDA( le_for_bm, ri_for_bm, disparity, ndisp_, winSize_, minSSD_, stream);` – Zendel Mar 09 '19 at 02:32
  • It freezes on `callers[winsz2](left, right, disp, maxdisp, stream);` in `stereobm.cu`. I'm not sure I can go much further with this, knowing next to nothing about CUDA. – Zendel Mar 09 '19 at 02:38
  • There's a call to `cudaDeviceSynchronize()` in `template void kernel_caller`. That's where it hangs. – Zendel Mar 09 '19 at 02:59
  • Please see [that question's comments](https://stackoverflow.com/questions/25979764/cuda-hangs-on-cudadevicesynchronize-randomly) for more help. – Mat Mar 09 '19 at 15:58
  • Please *Edit* your question so that people can see the progress that's been made without having to read all the comments. – Mat Mar 10 '19 at 02:32
  • That discussion relates to an intermittent bug. I've got something 100% reproducible. I've made a minimal example (main.cpp, build script, build script for compiling OpenCV) that reproduces the bug. The code works fine on an AWS p2 instance (Tesla K80, compute capability 3.7), but fails on an AWS g3 instance (). It also fails on an AWS p3 instance. It also hangs on my home computer (GeForce RTX 2080 Ti, compute capability 7.5). You can download the zip file here: https://pageofswords.net/cudastereobm-bug_minimal-example.zip – Zendel Mar 10 '19 at 02:37
  • Sorry I made a mistake above. It DOES work on the AWS g3 (Tesla M60, compute 5.2). It hangs on AWS p3 (Tesla V100-SXM2-16GB, compute 7.0). I just checked this again to be sure. – Zendel Mar 10 '19 at 02:57
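
The comments above find the hang by instrumenting one call after another until the blocking `cudaDeviceSynchronize()` turns up. A timeout wrapper can make that kind of bisection less tedious, because the program reports the stuck call instead of freezing. The following is only a sketch: `finishes_within` is a hypothetical helper name, not an OpenCV or CUDA function, and it only detects a hang; it cannot cancel one.

```cpp
#include <chrono>
#include <exception>
#include <functional>
#include <future>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical helper (not part of OpenCV or CUDA): runs `work` on a detached
// thread and returns true if it completed (or threw) within `timeout`.
// A call that truly hangs leaves its thread blocked; this only reports it.
bool finishes_within(std::function<void()> work,
                     std::chrono::milliseconds timeout) {
    // The promise lives on the heap so the detached thread can still signal
    // completion safely after this function has returned.
    auto done = std::make_shared<std::promise<void>>();
    std::future<void> finished = done->get_future();

    std::thread([work = std::move(work), done]() {
        try {
            work();
            done->set_value();
        } catch (...) {
            // An exception also counts as "did not hang".
            done->set_exception(std::current_exception());
        }
    }).detach();

    return finished.wait_for(timeout) == std::future_status::ready;
}
```

In the question's setting, the call under suspicion would go inside the lambda, for example `finishes_within([&] { sbm_ptr->compute(gpu_left, gpu_right, gpu_result); }, std::chrono::seconds(10))`. Anything captured by reference has to outlive the detached thread, so after a reported hang the simplest course is to log the result and exit the process.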

1 Answer


I faced a similar hang while running StereoBM on compute capability 7.2; it worked fine on 6.2. I checked the issues on OpenCV's GitHub, and there appears to have been a race condition in OpenCV versions before 3.4.6.

You can find the fix in this pull request:

https://github.com/opencv/opencv/pull/13850

I added the patch to my existing code. It worked without any hiccups. Hope it helps.
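
For anyone who cannot patch or upgrade right away, the reports on this page can be folded into a small startup check. This is a rough sketch, not a definitive rule: the function names are hypothetical, and the "major compute capability 7 or higher" threshold is an assumption extrapolated from the handful of devices mentioned here (fine on 3.7, 5.2 and 6.2; hanging on 7.0, 7.2 and 7.5). In a real program the version numbers would come from `CV_VERSION_MAJOR`, `CV_VERSION_MINOR` and `CV_VERSION_REVISION`, and the capability from `cv::cuda::DeviceInfo::majorVersion()`.

```cpp
// True when (major, minor, revision) is older than OpenCV 3.4.6, the version
// named in the answer. The 4.x releases are not covered by this thread; the
// function returns false for them without implying that they are fixed.
constexpr bool predates_3_4_6(int major, int minor, int revision) {
    return major < 3 ||
           (major == 3 && (minor < 4 || (minor == 4 && revision < 6)));
}

// Assumption drawn from the reports in this thread only: hangs were seen on
// compute capability 7.0, 7.2 and 7.5, and not on 3.7, 5.2 or 6.2.
constexpr bool hang_reported_for(int cc_major) {
    return cc_major >= 7;
}

// Hypothetical guard: warn or fall back when both conditions hold.
constexpr bool stereo_bm_hang_suspected(int cv_major, int cv_minor,
                                        int cv_revision, int cc_major) {
    return predates_3_4_6(cv_major, cv_minor, cv_revision) &&
           hang_reported_for(cc_major);
}
```

With the configurations from the question, `stereo_bm_hang_suspected(3, 4, 4, 7)` is true for the RTX 2080 Ti and V100, and `stereo_bm_hang_suspected(3, 4, 4, 3)` is false for the K80.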

Madhu Soodhan