8

I have the following image mask:

mask

I want to apply something similar to cv::findContours, but that algorithm only joins connected points in the same groups. I want to do this with some tolerance, i.e., I want to add the pixels near each other within a given radius tolerance: this is similar to Euclidean distance hierarchical clustering.

Is this implemented in OpenCV? Or is there any fast approach for implementing this?

What I want is something similar to this,

http://www.pointclouds.org/documentation/tutorials/cluster_extraction.php

applied to the white pixels of this mask.

Thank you.

Humam Helfawi
manatttta
  • It's not clear what you want the algorithm to do. Can you show another image with the expected result? At one level, it seems like morphological operators would give you all you need so I'm sure that can't be the case. We need to see the effect you're trying to achieve. – Roger Rowland Nov 20 '15 at 11:29
  • @RogerRowland No, morphological operators are not an option, as they will distort my edges. What I want is to group the edges in my mask image by the Euclidean distance between them. Something similar to http://www.pointclouds.org/documentation/tutorials/cluster_extraction.php – manatttta Nov 20 '15 at 11:35
  • I think @Humam's suggestion is a good one, although there is no OpenCV implementation. For clustering tasks in OpenCV you won't get much more than k-means or mean shift. However, as you already linked an example algorithm, it might be more straightforward to just port that to OpenCV (and presumably you don't need 3D). – Roger Rowland Nov 20 '15 at 11:48
  • @RogerRowland [cv::partition](http://docs.opencv.org/2.4/modules/core/doc/clustering.html#partition) is perfect for this task. – Miki Nov 20 '15 at 18:59
  • @Miki Good stuff, that's a new one for me, +1 for your comment and answer. – Roger Rowland Nov 20 '15 at 19:09

2 Answers

12

You can use partition for this:

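partition splits an element set into equivalence classes. You can define the equivalence predicate as "the two points lie within a given Euclidean distance (radius tolerance) of each other"; points linked through a chain of such neighbours then end up in the same class.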

If you have C++11, you can simply use a lambda function:

int th_distance = 18; // radius tolerance

int th2 = th_distance * th_distance; // squared radius tolerance
vector<int> labels;

int n_labels = partition(pts, labels, [th2](const Point& lhs, const Point& rhs) {
    return ((lhs.x - rhs.x)*(lhs.x - rhs.x) + (lhs.y - rhs.y)*(lhs.y - rhs.y)) < th2; 
});

Otherwise, you can build a functor (see the full code below).

With an appropriate radius tolerance (I found that 18 works well on this image), I got:

enter image description here

Full code:

#include <opencv2/opencv.hpp>
#include <cstdlib>
#include <vector>
#include <algorithm>

using namespace std;
using namespace cv;

struct EuclideanDistanceFunctor
{
    int _dist2;
    EuclideanDistanceFunctor(int dist) : _dist2(dist*dist) {}

    bool operator()(const Point& lhs, const Point& rhs) const
    {
        return ((lhs.x - rhs.x)*(lhs.x - rhs.x) + (lhs.y - rhs.y)*(lhs.y - rhs.y)) < _dist2;
    }
};

int main()
{
    // Load the image (grayscale)
    Mat1b img = imread("path_to_image", IMREAD_GRAYSCALE);

    // Stop early if the image could not be read
    if (img.empty()) return -1;

    // Get all non black points
    vector<Point> pts;
    findNonZero(img, pts);

    // Define the radius tolerance
    int th_distance = 18; // radius tolerance

    // Apply partition 
    // All pixels within the radius tolerance distance will belong to the same class (same label)
    vector<int> labels;

    // With functor
    //int n_labels = partition(pts, labels, EuclideanDistanceFunctor(th_distance));

    // With lambda function (requires C++11)
    int th2 = th_distance * th_distance;
    int n_labels = partition(pts, labels, [th2](const Point& lhs, const Point& rhs) {
        return ((lhs.x - rhs.x)*(lhs.x - rhs.x) + (lhs.y - rhs.y)*(lhs.y - rhs.y)) < th2;
    });

    // You can save all points in the same class in a vector (one for each class), just like findContours
    vector<vector<Point>> contours(n_labels);
    for (size_t i = 0; i < pts.size(); ++i)
    {
        contours[labels[i]].push_back(pts[i]);
    }

    // Draw results

    // Build a vector of random color, one for each class (label)
    vector<Vec3b> colors;
    for (int i = 0; i < n_labels; ++i)
    {
        colors.push_back(Vec3b(rand() & 255, rand() & 255, rand() & 255));
    }

    // Draw the labels
    Mat3b lbl(img.rows, img.cols, Vec3b(0, 0, 0));
    for (size_t i = 0; i < pts.size(); ++i)
    {
        lbl(pts[i]) = colors[labels[i]];
    }

    imshow("Labels", lbl);
    waitKey();

    return 0;
}
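
For intuition about what partition does with this predicate, here is a minimal OpenCV-free sketch of the same idea. It assumes partition behaves like a union-find over all point pairs (the OpenCV documentation points to the disjoint-set data structure), so treat it as an illustration rather than the library's actual code. `Pt`, `findRoot` and `clusterByRadius` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for cv::Point so the sketch needs no OpenCV.
struct Pt { int x; int y; };

// Find the representative of element i, halving paths on the way up.
static int findRoot(std::vector<int>& parent, int i)
{
    while (parent[i] != i)
    {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Two points closer than `radius` share a label, and so do points that are
// linked through a chain of such neighbours.
// Returns the number of clusters; labels are in [0, count).
int clusterByRadius(const std::vector<Pt>& pts, int radius, std::vector<int>& labels)
{
    assert(radius >= 0);
    const int n = static_cast<int>(pts.size());
    const long long r2 = 1LL * radius * radius; // squared radius tolerance

    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0); // every point starts alone

    // O(n^2) pairwise test: merge the sets of any two close points.
    for (int i = 0; i < n; ++i)
    {
        for (int j = i + 1; j < n; ++j)
        {
            const long long dx = pts[i].x - pts[j].x;
            const long long dy = pts[i].y - pts[j].y;
            if (dx * dx + dy * dy < r2)
            {
                const int ri = findRoot(parent, i);
                const int rj = findRoot(parent, j);
                parent[ri] = rj;
            }
        }
    }

    // Map each root to a consecutive label.
    labels.assign(n, -1);
    std::vector<int> rootLabel(n, -1);
    int count = 0;
    for (int i = 0; i < n; ++i)
    {
        const int r = findRoot(parent, i);
        if (rootLabel[r] < 0) rootLabel[r] = count++;
        labels[i] = rootLabel[r];
    }
    return count;
}
```

Note the chaining effect: with a radius of 4, the points (0,0), (3,0) and (6,0) all get the same label even though the first and last are 6 apart. The pairwise loop is quadratic in the number of white pixels, which is worth keeping in mind for dense masks.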
Miki
  • +1 for an OpenCV approach. -1 for non-density based approach. The threshold of the distance will be a nightmare. I would be glad if you share your thought about a universal way to adjust the threshold in this problem ;) – Humam Helfawi Nov 21 '15 at 22:10
  • @HumamHelfawi Well, thanks for +1 ;D. Regarding -1 let me answer back: 1) DBSCAN (at least in its original formulation) requires 2 parameters, one of which is exactly the distance threshold (as here). 2) The question is explicitly about clustering within a radius tolerance, not about a density-based approach. 3) The question doesn't mention being robust to outliers. So that's why I proposed this solution. – Miki Nov 21 '15 at 23:38
  • @HumamHelfawi regarding adjusting the threshold, it really depends on the final goal of the OP. He probably needs to put in the same class all points belonging to the same circle. In this kind of image (basically no noise) this can be accomplished easily with some variants of randomized hough transform (but you still need to set a few parameters). Let me think on some other approaches for a while :D – Miki Nov 21 '15 at 23:43
  • @Miki I did not say that my answer is good. I have just criticized your answer :D. just joking.. DBSCAN requires a distance parameter, yes, this is true. However, DBSCAN works in such a manner that the distance threshold represents the max valid distance between a point and any other point within the class, so one static threshold should work for all examples. But since the OP stated the tolerance keyword directly, I withdraw my -1 and keep the +1. If I find some time, I may try DBSCAN and post the results here... have a nice day :) – Humam Helfawi Nov 21 '15 at 23:46
  • @HumamHelfawi sure :D I really appreciate constructive criticism. I'm really curious about DBSCAN results, please ping me if you post them. Have a nice day! (sadly for me is a working night) – Miki Nov 21 '15 at 23:52
  • Hello, thank you both @HumamHelfawi and Miki for the answers! both performed quite well. This is a simple case anyway, so I will just stick with the simpler/faster approach! thank you both once again – manatttta Nov 23 '15 at 10:31
  • @Miki I edited my answer to show the results. However, the problem of DBSCAN that uses fixed threshold is that any small noise would be put outside the cluster... – Humam Helfawi Nov 23 '15 at 20:44
  • I used kmeans for things like this. For my particular case this one is better controlled. – Cynichniy Bandera May 15 '16 at 12:00
  • in order to find dist threshold: (1) find avg dist for each of elements to all other elements; (2) find min avg dist. This is your threshold +- (5-10%) – Cynichniy Bandera May 15 '16 at 17:54
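
The threshold heuristic from the last comment can be sketched as follows. This is only one reading of that comment (average each point's distance to all other points, then take the smallest average as a starting threshold), and it has not been checked against the mask in the question. `Pt` and `minAverageDistance` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for cv::Point so the sketch needs no OpenCV.
struct Pt { int x; int y; };

// For every point, average its Euclidean distance to all other points,
// then return the smallest of those averages.
// O(n^2), so it may be slow when the mask has many white pixels.
double minAverageDistance(const std::vector<Pt>& pts)
{
    assert(pts.size() >= 2);
    double best = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < pts.size(); ++i)
    {
        double sum = 0.0;
        for (std::size_t j = 0; j < pts.size(); ++j)
        {
            if (i == j) continue;
            sum += std::hypot(double(pts[i].x - pts[j].x),
                              double(pts[i].y - pts[j].y));
        }
        const double avg = sum / double(pts.size() - 1);
        if (avg < best) best = avg;
    }
    return best;
}
```

Following the comment, the returned value would then be adjusted by roughly 5-10% in either direction before being used as the radius tolerance.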
2

I suggest using the DBSCAN algorithm; it does exactly what you are looking for. Use a simple Euclidean distance, though Manhattan distance may work even better. The input is the set of all white (thresholded) points, and the output is groups of points (your connected components).

Here is a DBSCAN C++ implementation

EDIT: I tried DBSCAN myself and here is the result:

enter image description here

As you can see, only the points that are really connected are considered one cluster.

This result was obtained using the standard DBSCAN algorithm with EPS=3 (static, no need to tune it), MinPoints=1 (also static), and Manhattan distance.
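
For readers who want to experiment with these settings, below is a minimal, unoptimized DBSCAN sketch using Manhattan distance. It is not the implementation linked above; `Pt`, `regionQuery` and `dbscan` are hypothetical names, and the neighbour search is a plain linear scan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for cv::Point so the sketch needs no OpenCV.
struct Pt { int x; int y; };

// Indices of all points within Manhattan distance eps of pts[i] (i included).
static std::vector<int> regionQuery(const std::vector<Pt>& pts, int i, int eps)
{
    std::vector<int> out;
    for (int j = 0; j < static_cast<int>(pts.size()); ++j)
    {
        if (std::abs(pts[i].x - pts[j].x) + std::abs(pts[i].y - pts[j].y) <= eps)
            out.push_back(j);
    }
    return out;
}

// Returns the number of clusters. labels[i] is a cluster id >= 0, or -1 for noise.
int dbscan(const std::vector<Pt>& pts, int eps, int minPts, std::vector<int>& labels)
{
    assert(eps >= 0 && minPts >= 1);
    const int UNVISITED = -2;
    const int NOISE = -1;
    const int n = static_cast<int>(pts.size());
    labels.assign(n, UNVISITED);

    int cluster = 0;
    for (int i = 0; i < n; ++i)
    {
        if (labels[i] != UNVISITED) continue;

        std::vector<int> seeds = regionQuery(pts, i, eps);
        if (static_cast<int>(seeds.size()) < minPts)
        {
            labels[i] = NOISE; // may still become a border point later
            continue;
        }

        labels[i] = cluster;
        // Grow the cluster; seeds is extended while we walk it by index.
        for (std::size_t k = 0; k < seeds.size(); ++k)
        {
            const int q = seeds[k];
            if (labels[q] == NOISE) labels[q] = cluster; // border point
            if (labels[q] != UNVISITED) continue;
            labels[q] = cluster;

            const std::vector<int> nb = regionQuery(pts, q, eps);
            if (static_cast<int>(nb.size()) >= minPts) // q is a core point
                seeds.insert(seeds.end(), nb.begin(), nb.end());
        }
        ++cluster;
    }
    return cluster;
}
```

With MinPoints=1 every point counts as a core point (its neighbourhood includes itself), so nothing is labelled as noise and the clusters are simply the groups of points chained together within EPS. Raising MinPoints is what makes isolated points fall out as noise.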

Humam Helfawi