1

I am trying to set the two Canny thresholds by computing some statistics on the gradient magnitude image (which seems like a better thing to do rather than computing thresholds (like Otsu) on the grayscale image as many people seem to do, as these thresholds are related to but wildly different valued from the gradient magnitude image that the thresholds are actually applied to). However, the computed thresholds need to be computed from exactly the same gradient magnitude image that Canny ends up thresholding internally or the result will not be as expected. That is, cv::canny internally does some smoothing (the parameter of which is not exposed), applies the Sobel operators, performs either the fast or complete gradient magnitude calculation, etc., and then applies the user specified thresholds before doing the thinning/linking etc. I would have to do exactly these same steps externally before computing my statistics so that the thresholds I pass to cv::canny actually make sense.

Is there a way to access this image that is being used inside the algorithm?

David Doria
  • 9,873
  • 17
  • 85
  • 147
  • Hi David, how are you? My first thought was to redirect you to [egonSchiele](https://gist.github.com/egonSchiele/756833) implementation, but I saw you already did that ;D. I have a small variant of this one (done a few years ago) as a standalone function (that needs OpenCV, but no need to recompile). So you can put your "_compute my statistics_" code inside this function. Is that what you're looking for? – Miki Mar 20 '16 at 14:22
  • @Miki Yea, I mean I guess I was just looking for a confirmation that "you can't get to this internal image that you want" is the right answer :). I guess a second best would indeed be to use my own Canny function from which I CAN expose this internal image. Do you have your standalone version posted anywhere (recompiling OpenCV isn't a satisfying option)? Can you comment on the fairly common suggestion I see of using a function of the Otsu threshold as the Canny thresholds? This really doesn't seem to be a meaningful thing to do to me. – David Doria Mar 20 '16 at 14:30
  • I'll post the code as an answer to this question. Otsu on the grayscale image doesn't make sense to me, since grayscale image is on intensity values in interval [0,255], while you need the thresholds on gradients (so different semantic value) which values are in range [0, something like 1530] – Miki Mar 20 '16 at 14:38
  • @Miki Ok I'm glad I'm not missing something. Tons of things suggest doing that though... http://stackoverflow.com/a/4653368/284529 http://stackoverflow.com/a/16047590/284529 http://stackoverflow.com/a/24673140/284529 http://ecc.isc.gov.ir/showJournal/4051/48873/654798 and perhaps mainly http://www.pyimagesearch.com/2015/04/06/zero-parameter-automatic-canny-edge-detection-with-python-and-opencv/#comment-392417 (I've started a discussion with the author in the comments). – David Doria Mar 20 '16 at 16:00
  • I was aware of that (see [my comment](http://www.pyimagesearch.com/2015/04/06/zero-parameter-automatic-canny-edge-detection-with-python-and-opencv/#comment-302768) of a year ago... ;D) I didn't investigate further because it was the same as [kerry wong](http://www.kerrywong.com/2009/05/07/canny-edge-detection-auto-thresholding/) (you mentioned this too) which I tested (probably that was my Canny2) but didn't gave good results in my use cases. – Miki Mar 20 '16 at 16:20
  • Your linked SO answers, as well as all these other posts, actually scare me a little now. Either I'm missing something, or everyone is just wrong ;D – Miki Mar 20 '16 at 16:22
  • @Miki Yea, my thoughts exactly :). Also, you mentioned that the gradient magnitude should be in the range "[0, something like 1530]". I am seeing max gradient values up to 13,000 in some cases inside your Canny3? – David Doria Mar 20 '16 at 17:03
  • My bad.. actually [that was for Sobel](http://stackoverflow.com/a/35103045/5008845). Obviously the magnitude can be higher – Miki Mar 20 '16 at 17:08

1 Answers1

2

You cannot directly get the internal state of the OpenCV Canny function, but you can extract the OpenCV code and make your own function.

Here is a function that automatically selects the Canny thresholds (based on egonSchiele implementation).

Note that, in this function:

  • Will output the results of Sobel gradients sobel_x and sobel_y, so you can avoid to recompute it with Sobel in case you want to work with image gradients later. (You can easily refactor this out if not needed)

  • This code always uses the L1 gradient to compute the statistics. Then it uses L1 or L2 according to input arguments for actual magnitude computation.

  • Here the magic numbers are fixed. You can easily refactor the code to pass them as input arguments. These magic numbers are:

    • NUM_BINS: the number of bins of the histogram used to compute the statistics
    • percent_of_pixels_not_edges: to estimate the higher Canny threshold
    • threshold_ratio: to recover the lower Canny threshold.

Regarding the usage of Otsu on the grayscale image to recover the Canny thresholds... Well, it doesn't make much sense to me, since the "grayscale" image and the "gradient magnitude" image have different semantic and values range.


Code:

#include<opencv2/opencv.hpp>
using namespace cv;


// Based on https://gist.github.com/egonSchiele/756833
void cvCanny3(const void* srcarr, void* dstarr,
    void* dxarr, void* dyarr,
    int aperture_size)
{
    cv::AutoBuffer<char> buffer;
    std::vector<uchar*> stack;
    uchar **stack_top = 0, **stack_bottom = 0;

    CvMat srcstub, *src = cvGetMat(srcarr, &srcstub);
    CvMat dststub, *dst = cvGetMat(dstarr, &dststub);

    CvMat dxstub, *dx = cvGetMat(dxarr, &dxstub);
    CvMat dystub, *dy = cvGetMat(dyarr, &dystub);


    CvSize size;
    int flags = aperture_size;
    int low, high;
    int* mag_buf[3];
    uchar* map;
    ptrdiff_t mapstep;
    int maxsize;
    int i, j;
    CvMat mag_row;

    if (CV_MAT_TYPE(src->type) != CV_8UC1 ||
        CV_MAT_TYPE(dst->type) != CV_8UC1 ||
        CV_MAT_TYPE(dx->type) != CV_16SC1 ||
        CV_MAT_TYPE(dy->type) != CV_16SC1)
        CV_Error(CV_StsUnsupportedFormat, "");

    if (!CV_ARE_SIZES_EQ(src, dst))
        CV_Error(CV_StsUnmatchedSizes, "");

    aperture_size &= INT_MAX;
    if ((aperture_size & 1) == 0 || aperture_size < 3 || aperture_size > 7)
        CV_Error(CV_StsBadFlag, "");


    size.width = src->cols;
    size.height = src->rows;

    //aperture_size = -1; //SCHARR
    cvSobel(src, dx, 1, 0, aperture_size);
    cvSobel(src, dy, 0, 1, aperture_size);


    //% Calculate Magnitude of Gradient
    //magGrad = hypot(dx, dy);

    Mat1f magGrad(size.height, size.width, 0.f);
    float maxGrad(0);
    float val(0);
    for (i = 0; i<size.height; ++i)
    {
        float* _pmag = magGrad.ptr<float>(i);
        const short* _dx = (short*)(dx->data.ptr + dx->step*i);
        const short* _dy = (short*)(dy->data.ptr + dy->step*i);
        for (j = 0; j<size.width; ++j)
        {
            val = float(abs(_dx[j]) + abs(_dy[j]));
            _pmag[j] = val;
            maxGrad = (val > maxGrad) ? val : maxGrad;
        }
    }

    //% Normalize for threshold selection
    //normalize(magGrad, magGrad, 0.0, 1.0, NORM_MINMAX);

    //% Determine Hysteresis Thresholds

    // -------------------------------------------------
    //% Set magic numbers
    const int NUM_BINS = 64;
    const double percent_of_pixels_not_edges = 0.9;
    const double threshold_ratio = 0.25;
    // -------------------------------------------------

    //% Compute histogram
    int bin_size = cvFloor(maxGrad / float(NUM_BINS) + 0.5f) + 1;
    if (bin_size < 1) bin_size = 1;
    int bins[NUM_BINS] = { 0 };
    for (i = 0; i<size.height; ++i)
    {
        float *_pmag = magGrad.ptr<float>(i);
        for (j = 0; j<size.width; ++j)
        {
            int hgf = int(_pmag[j]);
            bins[int(_pmag[j]) / bin_size]++;
        }
    }




    //% Select the thresholds
    float total(0.f);
    float target = float(size.height * size.width * percent_of_pixels_not_edges);
    int low_thresh, high_thresh(0);

    while (total < target)
    {
        total += bins[high_thresh];
        high_thresh++;
    }
    high_thresh *= bin_size;
    low_thresh = cvFloor(threshold_ratio * float(high_thresh));

    if (flags & CV_CANNY_L2_GRADIENT)
    {
        Cv32suf ul, uh;
        ul.f = (float)low_thresh;
        uh.f = (float)high_thresh;

        low = ul.i;
        high = uh.i;
    }
    else
    {
        low = cvFloor(low_thresh);
        high = cvFloor(high_thresh);
    }


    buffer.allocate((size.width + 2)*(size.height + 2) + (size.width + 2) * 3 * sizeof(int));
    mag_buf[0] = (int*)(char*)buffer;
    mag_buf[1] = mag_buf[0] + size.width + 2;
    mag_buf[2] = mag_buf[1] + size.width + 2;
    map = (uchar*)(mag_buf[2] + size.width + 2);
    mapstep = size.width + 2;

    maxsize = MAX(1 << 10, size.width*size.height / 10);
    stack.resize(maxsize);
    stack_top = stack_bottom = &stack[0];

    memset(mag_buf[0], 0, (size.width + 2)*sizeof(int));
    memset(map, 1, mapstep);
    memset(map + mapstep*(size.height + 1), 1, mapstep);

    /* sector numbers
    (Top-Left Origin)

    1   2   3
    *  *  *
    * * *
    0*******0
    * * *
    *  *  *
    3   2   1
    */

#define CANNY_PUSH(d)    *(d) = (uchar)2, *stack_top++ = (d)
#define CANNY_POP(d)     (d) = *--stack_top

    mag_row = cvMat(1, size.width, CV_32F);

    // calculate magnitude and angle of gradient, perform non-maxima supression.
    // fill the map with one of the following values:
    //   0 - the pixel might belong to an edge
    //   1 - the pixel can not belong to an edge
    //   2 - the pixel does belong to an edge
    for (i = 0; i <= size.height; i++)
    {
        int* _mag = mag_buf[(i > 0) + 1] + 1;
        float* _magf = (float*)_mag;
        const short* _dx = (short*)(dx->data.ptr + dx->step*i);
        const short* _dy = (short*)(dy->data.ptr + dy->step*i);
        uchar* _map;
        int x, y;
        ptrdiff_t magstep1, magstep2;
        int prev_flag = 0;

        if (i < size.height)
        {
            _mag[-1] = _mag[size.width] = 0;

            if (!(flags & CV_CANNY_L2_GRADIENT))
                for (j = 0; j < size.width; j++)
                    _mag[j] = abs(_dx[j]) + abs(_dy[j]);

            else
            {
                for (j = 0; j < size.width; j++)
                {
                    x = _dx[j]; y = _dy[j];
                    _magf[j] = (float)std::sqrt((double)x*x + (double)y*y);
                }
            }
        }
        else
            memset(_mag - 1, 0, (size.width + 2)*sizeof(int));

        // at the very beginning we do not have a complete ring
        // buffer of 3 magnitude rows for non-maxima suppression
        if (i == 0)
            continue;

        _map = map + mapstep*i + 1;
        _map[-1] = _map[size.width] = 1;

        _mag = mag_buf[1] + 1; // take the central row
        _dx = (short*)(dx->data.ptr + dx->step*(i - 1));
        _dy = (short*)(dy->data.ptr + dy->step*(i - 1));

        magstep1 = mag_buf[2] - mag_buf[1];
        magstep2 = mag_buf[0] - mag_buf[1];

        if ((stack_top - stack_bottom) + size.width > maxsize)
        {
            int sz = (int)(stack_top - stack_bottom);
            maxsize = MAX(maxsize * 3 / 2, maxsize + 8);
            stack.resize(maxsize);
            stack_bottom = &stack[0];
            stack_top = stack_bottom + sz;
        }

        for (j = 0; j < size.width; j++)
        {
#define CANNY_SHIFT 15
#define TG22  (int)(0.4142135623730950488016887242097*(1<<CANNY_SHIFT) + 0.5)

            x = _dx[j];
            y = _dy[j];
            int s = x ^ y;
            int m = _mag[j];

            x = abs(x);
            y = abs(y);
            if (m > low)
            {
                int tg22x = x * TG22;
                int tg67x = tg22x + ((x + x) << CANNY_SHIFT);

                y <<= CANNY_SHIFT;

                if (y < tg22x)
                {
                    if (m > _mag[j - 1] && m >= _mag[j + 1])
                    {
                        if (m > high && !prev_flag && _map[j - mapstep] != 2)
                        {
                            CANNY_PUSH(_map + j);
                            prev_flag = 1;
                        }
                        else
                            _map[j] = (uchar)0;
                        continue;
                    }
                }
                else if (y > tg67x)
                {
                    if (m > _mag[j + magstep2] && m >= _mag[j + magstep1])
                    {
                        if (m > high && !prev_flag && _map[j - mapstep] != 2)
                        {
                            CANNY_PUSH(_map + j);
                            prev_flag = 1;
                        }
                        else
                            _map[j] = (uchar)0;
                        continue;
                    }
                }
                else
                {
                    s = s < 0 ? -1 : 1;
                    if (m > _mag[j + magstep2 - s] && m > _mag[j + magstep1 + s])
                    {
                        if (m > high && !prev_flag && _map[j - mapstep] != 2)
                        {
                            CANNY_PUSH(_map + j);
                            prev_flag = 1;
                        }
                        else
                            _map[j] = (uchar)0;
                        continue;
                    }
                }
            }
            prev_flag = 0;
            _map[j] = (uchar)1;
        }

        // scroll the ring buffer
        _mag = mag_buf[0];
        mag_buf[0] = mag_buf[1];
        mag_buf[1] = mag_buf[2];
        mag_buf[2] = _mag;
    }

    // now track the edges (hysteresis thresholding)
    while (stack_top > stack_bottom)
    {
        uchar* m;
        if ((stack_top - stack_bottom) + 8 > maxsize)
        {
            int sz = (int)(stack_top - stack_bottom);
            maxsize = MAX(maxsize * 3 / 2, maxsize + 8);
            stack.resize(maxsize);
            stack_bottom = &stack[0];
            stack_top = stack_bottom + sz;
        }

        CANNY_POP(m);

        if (!m[-1])
            CANNY_PUSH(m - 1);
        if (!m[1])
            CANNY_PUSH(m + 1);
        if (!m[-mapstep - 1])
            CANNY_PUSH(m - mapstep - 1);
        if (!m[-mapstep])
            CANNY_PUSH(m - mapstep);
        if (!m[-mapstep + 1])
            CANNY_PUSH(m - mapstep + 1);
        if (!m[mapstep - 1])
            CANNY_PUSH(m + mapstep - 1);
        if (!m[mapstep])
            CANNY_PUSH(m + mapstep);
        if (!m[mapstep + 1])
            CANNY_PUSH(m + mapstep + 1);
    }

    // the final pass, form the final image
    for (i = 0; i < size.height; i++)
    {
        const uchar* _map = map + mapstep*(i + 1) + 1;
        uchar* _dst = dst->data.ptr + dst->step*i;

        for (j = 0; j < size.width; j++)
        {
            _dst[j] = (uchar)-(_map[j] >> 1);
        }
    }
};

void Canny3(InputArray image, OutputArray _edges,
    OutputArray _sobel_x, OutputArray _sobel_y,
    int apertureSize = 3, bool L2gradient = false)
{
    Mat src = image.getMat();
    _edges.create(src.size(), CV_8U);
    _sobel_x.create(src.size(), CV_16S);
    _sobel_y.create(src.size(), CV_16S);


    CvMat c_src = src, c_dst = _edges.getMat();
    CvMat c_dx = _sobel_x.getMat();
    CvMat c_dy = _sobel_y.getMat();


    cvCanny3(&c_src, &c_dst,
        &c_dx, &c_dy,
        apertureSize + (L2gradient ? CV_CANNY_L2_GRADIENT : 0));
};

int main()
{
    Mat3b img = imread("path_to_image");
    Mat1b gray;
    cvtColor(img, gray, COLOR_BGR2GRAY);

    Mat1b edges;
    Mat1s sobel_x, sobel_y;
    Canny3(gray, edges, sobel_x, sobel_y);

    imshow("edges", edges);
    waitKey();

    return 0;
}
Miki
  • 40,887
  • 13
  • 123
  • 202