6

Question is, how to clusterize pairs of some units by their angle? Problem is that, kmeans operates on the notion of Euclidean space distance and does not know about periodic nature of angles. So to make it work, one needs to translate the angle to Euclidean space but hold the following true:

  1. close angles are close values in Euclidean space;
  2. far angles are far in Euclidean space.

Which means, that, 90 and -90 are distant values, 180 and -180 is the same, 170 and -170 are close (angles come from left up and to right: 0 - +180 and from left down to the right: 0 - -180)

I tried to use various sin() functions but they all have issues mentioned in points 1 and 2. Most perspective one is sin(x * 0.5f) but also having the problem that 180 and -180 are distant values in Euclidean space.

Skippy le Grand Gourou
  • 6,976
  • 4
  • 60
  • 76
Cynichniy Bandera
  • 5,991
  • 2
  • 29
  • 33
  • 1
    It would be much better to post the question and then post the answer as...the answer. – rbaleksandar Mar 18 '16 at 19:39
  • they definitely need to add functionality for making posts. – Cynichniy Bandera Mar 18 '16 at 20:22
  • 1
    Not at all. This is a Q&A site. If the question (along with the answers) is deemed worthy it can become part of the community wiki. For now and for such situations as yours just consider the way I've described in my previous comment as the correct one. – rbaleksandar Mar 18 '16 at 22:29
  • Umka, the _correct_ way of doing this is to post a question, and write your own answer. You can then accept your own answer, or, if other answers are better, you can accept another one. That's your choice. Doing this you'll also earn rep by upvotes on you answer – Miki Mar 19 '16 at 01:11
  • Btw, a little verbose... But good job ;D – Miki Mar 19 '16 at 01:14
  • Sorry but I cannot find how to answer my own question. It simply does not memo field like for other questions. – Cynichniy Bandera Mar 19 '16 at 08:25

1 Answers1

0

The solution I found is to translate angles to points on circle and feed them into kmeans. This way we make it to compare distances between points and this works perfectly.

Important thing to mention. Kmeans @eps in termination criterion is expressed in terms of units of samples that you feed to kmeans. In our example maximal distant points have dist 200 units (2 * radius). This means that having 1.0f is totally fine. If you use cv::normalize(samples, samples, 0.0f, 1.0f) for your samples before calling kmeans(), adjust your @eps appropriately. Something like eps=0.01f plays better here.

Enjoy! Hope this helps someone.

static cv::Point2f angleToPointOnCircle(float angle, float radius, cv::Point2f origin /* center */)
{
        float x = radius * cosf(angle * M_PI / 180.0f) + origin.x;
        float y = radius * sinf(angle * M_PI / 180.0f) + origin.y;
        return cv::Point2f(x, y);
}

static std::vector<std::pair<size_t, int> > biggestKmeansGroup(const std::vector<int> &labels, int count)
{
        std::vector<std::pair<size_t, int> > indices;
        std::map<int, size_t> l2cm;

        for (int i = 0; i < labels.size(); ++i)
                l2cm[labels[i]]++;

        std::vector<std::pair<size_t, int> > c2lm;
        for (std::map<int, size_t>::iterator it = l2cm.begin(); it != l2cm.end(); it++)
                c2lm.push_back(std::make_pair(it->second, it->first)); // count, group

        std::sort(c2lm.begin(), c2lm.end(), cmp_pair_first_reverse);
        for (int i = 0; i < c2lm.size() && count-- > 0; i++)
                indices.push_back(c2lm[i]);
        return indices;
}

static void sortByAngle(std::vector<boost::shared_ptr<Pair> > &group,
                        std::vector<boost::shared_ptr<Pair> > &result)
{
        std::vector<int> labels;
        cv::Mat samples;

        /* Radius is not so important here. */
        for (int i = 0; i < group.size(); i++)
                samples.push_back(angleToPointOnCircle(group[i]->angle, 100, cv::Point2f(0, 0)));

        /* 90 degrees per group. May be less if you need it. */
        static int PAIR_MAX_FINE_GROUPS = 4;

        int groupNr = std::max(std::min((int)group.size(), PAIR_MAX_FINE_GROUPS), 1);
        assert(group.size() >= groupNr);
        cv::kmeans(samples.reshape(1, (int)group.size()), groupNr, labels,
                   cvTermCriteria(CV_TERMCRIT_EPS/* | CV_TERMCRIT_ITER*/, 30, 1.0f),
                   100, cv::KMEANS_RANDOM_CENTERS);

        std::vector<std::pair<size_t, int> > biggest = biggestKmeansGroup(labels, groupNr);
        for (int g = 0; g < biggest.size(); g++) {
                for (int i = 0; i < group.size(); i++) {
                        if (labels[i] == biggest[g].second)
                                result.push_back(group[i]);
                }
        }
}
Skippy le Grand Gourou
  • 6,976
  • 4
  • 60
  • 76