How to use OpenACC on 2D subvector in C++ or OpenCV SubMatrix?

Question

I have the following code:

int main(int argc, char** argv )
{
    std::cout<<"running Lenna..\n";
    cv::Mat mat = imread("lena.bmp", cv::IMREAD_GRAYSCALE );

    //convert to vec
    std::vector<double> BWvec;
    BWvec.assign(mat.data, mat.data + mat.total()); // 8-bit pixels, each converted to double
    std::vector<std::vector<double>> vec2D;
    for (int i = 0; i < mat.rows; i++) {
        auto first = BWvec.begin() + (mat.cols * i); // a row holds mat.cols pixels
        auto last = first + mat.cols;
        std::vector<double> vec0(first, last);
        vec2D.push_back(vec0);
    }

    //#pragma acc parallel loop
    for (int i = 0; i <= 5; i++) {
        for (int j = 0; j <= 5; j++) {
            mat(cv::Rect(i,j, (4 - 0), (4 - 0)));

            //sub-vector[5:10][25:100]:
            std::vector<std::vector<double>> sub_vector;
            sub_vector.reserve(5);
            for (std::size_t k = 5; k < 10; ++k) {
                sub_vector.emplace_back(vec2D[k+i].begin() + 25, vec2D[k+i].begin() + 100);
            }
        }
    }

    return 0;
}

When I compile and run it with

pgc++ -fast -ta=nvidia:cuda9.2,managed -Minfo=accel -o lenna lenna.cpp -std=c++11 `pkg-config --cflags --libs opencv` -lgomp && ./lenna

it works fine serially, but when I uncomment the #pragma acc parallel loop, I get the error:

procedures called in a compute region must have acc routine information
accelerator region ignored, accelerator restriction .. no acc routine information

I also get this error if I comment out the mat(cv::Rect(i,j,(4-0),(4-0))) call and keep only the part after sub-vector[5:10][25:100], or if I keep the mat(cv::Rect(i,j,(4-0),(4-0))) call and comment out the part after sub-vector[5:10][25:100].

How can I fix this?

EDIT

To make this simpler, here are two separate programs and the errors each one gives:

lenna1.cpp:

#include <stdio.h>
#include <cmath>
#include <omp.h>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

//pgc++ -fast -ta=nvidia:cuda9.2,managed -Minfo=accel -o lenna lenna.cpp -std=c++11 `pkg-config --cflags --libs opencv` -lgomp && ./lenna

int main(int argc, char** argv )
{
    std::cout<<"running Lenna..\n";
    cv::Mat mat = imread("lena.bmp", cv::IMREAD_GRAYSCALE );

    #pragma acc parallel loop
    for (int i = 0; i <= 5; i++) {
        for (int j = 0; j <= 5; j++) {
            mat(cv::Rect(i, j, (4 - 0), (4 - 0)));
        }
    }
    return 0;
}

error from lenna1.cpp:

pgc++ -fast -ta=nvidia:cuda9.2,managed -Minfo=accel -o lenna1 lenna1.cpp -std=c++11 `pkg-config --cflags --libs opencv` -lgomp && ./lenna1
lenna1.cpp:
"lenna1.cpp", line 23: warning: last line of file ends without a newline
  }
   ^

PGCC-S-0155-Procedures called in a compute region must have acc routine information: cv::Mat::Mat(const cv::Mat&, const cv::Rect_<int> &) (lenna1.cpp: 379)
PGCC-S-0155-Accelerator region ignored; see -Minfo messages  (lenna1.cpp: 14)
main:
     14, Accelerator region ignored
         379, Accelerator restriction: call to 'cv::Mat::Mat(const cv::Mat&, const cv::Rect_<int> &)' with no acc routine information
PGCC/x86-64 Linux 19.10-0: compilation completed with severe errors

lenna2.cpp:

#include <stdio.h>
#include <cmath>
#include <omp.h>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

//pgc++ -fast -ta=nvidia:cuda9.2,managed -Minfo=accel -o lenna lenna.cpp -std=c++11 `pkg-config --cflags --libs opencv` -lgomp && ./lenna

int main(int argc, char** argv )
{
    std::cout<<"running Lenna..\n";
    cv::Mat mat = imread("lena.bmp", cv::IMREAD_GRAYSCALE );

    //convert to vec
    std::vector<double> BWvec;
    BWvec.assign(mat.data, mat.data + mat.total()); // 8-bit pixels, each converted to double
    std::vector<std::vector<double>> vec2D;
    for (int i = 0; i < mat.rows; i++) {
        auto first = BWvec.begin() + (mat.cols * i); // a row holds mat.cols pixels
        auto last = first + mat.cols;
        std::vector<double> vec0(first, last);
        vec2D.push_back(vec0);
    }

    #pragma acc parallel loop
    for (int i = 0; i <= 5; i++) {
        for (int j = 0; j <= 5; j++) {
            //sub-vector[5:10][25:100]:
            std::vector<std::vector<double>> sub_vector;
            sub_vector.reserve(5);
            for (std::size_t i = 5; i < 10; ++i) {
                sub_vector.emplace_back(vec2D[i].begin() + 25, vec2D[i].begin() + 100);
            }
        }
    }

    return 0;
}

error from lenna2.cpp:

pgc++ -fast -ta=nvidia:cuda9.2,managed -Minfo=accel -o lenna2 lenna2.cpp -std=c++11 `pkg-config --cflags --libs opencv` -lgomp && ./lenna2
lenna2.cpp:
"lenna2.cpp", line 40: warning: last line of file ends without a newline
  }
   ^

operator new (unsigned long, void *):
      4, include "opencv.hpp"
          47, include "core.hpp"
               56, include "algorithm"
                    10, include "algorithm"
                         62, include "stl_algo.h"
                              62, include "stl_tempbuf.h"
                                   60, include "stl_construct.h"
                                        59, include "new"
                                            130, Generating implicit acc routine seq
                                                 Generating acc routine seq
                                                 Generating Tesla code
operator delete (void *, void *):
      4, include "opencv.hpp"
          47, include "core.hpp"
               56, include "algorithm"
                    10, include "algorithm"
                         62, include "stl_algo.h"
                              62, include "stl_tempbuf.h"
                                   60, include "stl_construct.h"
                                        59, include "new"
                                            135, Generating implicit acc routine seq
                                                 Generating acc routine seq
                                                 Generating Tesla code
PGCC-S-0155-Procedures called in a compute region must have acc routine information: std::__throw_length_error(const char *) (lenna2.cpp: 69)
PGCC-S-0155-Accelerator region ignored; see -Minfo messages  (lenna2.cpp: 25)
main:
     25, Accelerator region ignored
          69, Accelerator restriction: call to 'std::__throw_length_error(const char *)' with no acc routine information
PGCC/x86-64 Linux 19.10-0: compilation completed with severe errors

Mat Colgrove · Answer 1 · 2020-07-01T19:48:01.613

To call routines and methods from the device, there need to be device versions of those routines. In cases where the definition of the called routine is visible to the compiler (such as with templates), the compiler will attempt to implicitly generate the device routine. Otherwise, it's the programmer's responsibility to decorate the called routine with an OpenACC "routine" directive.
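As a minimal sketch of the "routine" directive described above: the helper `scale_pixel` and the wrapper `scale_all` are made-up names for illustration and do not come from the question's code. The directive on the helper asks an OpenACC compiler to also build a device version, so the call inside the parallel loop is allowed. A compiler without OpenACC support should simply ignore the pragmas and run the loop serially.

```cpp
#include <cassert>

// Hypothetical helper (not from the question). The routine directive requests
// a sequential device version so it can be called inside a compute region.
#pragma acc routine seq
double scale_pixel(double v, double gain) {
    return v * gain;
}

// Hypothetical wrapper: applies scale_pixel to n values.
// The loop is offloaded only when the file is built with OpenACC enabled.
void scale_all(const double* in, double* out, int n, double gain) {
    assert(in != nullptr && out != nullptr && n >= 0);
    #pragma acc parallel loop copyin(in[0:n]) copyout(out[0:n])
    for (int i = 0; i < n; ++i) {
        out[i] = scale_pixel(in[i], gain);
    }
}
```

This only works when you own the source of the called function. For a library routine compiled separately, the directive has to be added inside the library itself.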

Since the information you provided is incomplete, it's difficult to know exactly how to fix your code. Which routines does the error message say are missing? Are you able to provide a complete reproducing example?

EDIT after the update.

call to 'cv::Mat::Mat(const cv::Mat&, const cv::Rect_<int> &)' with no acc routine information

It looks like a constructor for the "Mat" type doesn't have a device-callable version. While I'm not familiar with OpenCV's structure, I'm assuming this constructor is not templated and its definition is not included in the header, hence the compiler cannot implicitly create a device version. You'll need to add routine directives to the portion of OpenCV you wish to call from device code, or, if there are CUDA device routines, you might be able to call them instead by using an OpenACC routine directive with a bind clause.
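A different route, offered here only as a sketch, is to avoid constructing a cv::Mat inside the compute region and index the raw pixel buffer instead. The function `window_sums` below is hypothetical, and the summation merely stands in for whatever work the sub-matrix was meant for, since the question's loop body does nothing with it. It assumes a continuous, single-channel, 8-bit image, so on the host you would check `mat.isContinuous()` and pass `mat.data`, `mat.rows`, and `mat.cols`.

```cpp
#include <cassert>

// Hypothetical replacement for mat(cv::Rect(i, j, win, win)) in device code.
// For every i, j in [0, n) it sums the win x win window whose top-left corner
// is column i, row j (the same x/y order as cv::Rect) of a row-major image.
// Assumption: pix points to rows*cols contiguous 8-bit pixels.
void window_sums(const unsigned char* pix, int rows, int cols,
                 int win, int n, double* out) {
    assert(pix != nullptr && out != nullptr && n >= 1 && win >= 1);
    assert(n - 1 + win <= rows && n - 1 + win <= cols);
    #pragma acc parallel loop collapse(2) copyin(pix[0:rows*cols]) copyout(out[0:n*n])
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double sum = 0.0;
            #pragma acc loop seq
            for (int r = j; r < j + win; ++r) {
                #pragma acc loop seq
                for (int c = i; c < i + win; ++c) {
                    sum += pix[r * cols + c];
                }
            }
            out[i * n + j] = sum; // one result slot per (i, j) window
        }
    }
}
```

With the question's bounds this would be called with `win = 4` and `n = 6`. Only plain pointers and integers cross into the compute region, so no OpenCV routine needs a device version.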

69, Accelerator restriction: call to 'std::__throw_length_error(const char *)' with no acc routine information

Exception handling is not available for device code since it would need to be caught on the host and there currently isn't a way to support this.

In some cases you can work around this by disabling exceptions via the flag "--no_exceptions", but in this case it looks like OpenCV will complain if exceptions are disabled. Hence it's probably best to avoid using a vector on the device here.
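To illustrate what avoiding a vector on the device could look like, here is a hedged sketch that keeps the image in one flat, row-major array and writes each sub-block into a preallocated output buffer. The name `copy_blocks` and its parameters are assumptions made for this example. Nothing in the loop allocates memory or can throw, which is what triggered the `std::__throw_length_error` restriction above.

```cpp
#include <cassert>

// Hypothetical flat-array version of the sub-vector extraction.
// For each shift i in [0, shifts), copies rows [r0+i, r0+i+h) and columns
// [c0, c0+w) of a row-major rows x cols image into slot i of dst.
// Assumption: dst has room for shifts*h*w doubles, allocated on the host.
void copy_blocks(const double* src, int rows, int cols,
                 int r0, int c0, int h, int w, int shifts, double* dst) {
    assert(src != nullptr && dst != nullptr);
    assert(shifts >= 1 && r0 >= 0 && c0 >= 0 && h >= 1 && w >= 1);
    assert(r0 + shifts - 1 + h <= rows && c0 + w <= cols);
    #pragma acc parallel loop collapse(3) copyin(src[0:rows*cols]) copyout(dst[0:shifts*h*w])
    for (int i = 0; i < shifts; ++i) {
        for (int r = 0; r < h; ++r) {
            for (int c = 0; c < w; ++c) {
                dst[(i * h + r) * w + c] = src[(r0 + i + r) * cols + (c0 + c)];
            }
        }
    }
}
```

For the question's sub-vector[5:10][25:100] with six shifts, the call would use `r0 = 5`, `c0 = 25`, `h = 5`, `w = 75`, and `shifts = 6`. A host-side `std::vector<double>` can still own the storage, as long as only its `.data()` pointer and size are passed in.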

I provided more details in the `EDIT` – user5739619, Jul 01 '20 at 17:43
