I am confused as to how to use the OpenCV findHomography method to compute the optimal transformation.

The way I use it is as follows:

cv::Mat h = cv::findHomography(src, dst, CV_RANSAC, 5.f);

No matter how many times I run it, I get the same transformation matrix. I thought RANSAC is supposed to randomly select a subset of points to do the fitting, so why does it return the same transformation matrix every time? Is it related to some random number initialization? How can I make this behaviour actually random?

Secondly, how can I tune the number of RANSAC iterations in this setup? Usually the number of iterations is based on inlier ratios and things like that.

Mark Amery
Luca
  • RANSAC methods can be configured so that they choose the number of iterations automatically, or stop iterating once the probability that the "right" model was chosen is high enough. Read Zisserman for details. OpenCV probably uses that method, and it probably always finds that model in your data. – Micka Aug 16 '15 at 15:46
  • Just to provide some more info: as pointed out in https://answers.opencv.org/question/86554/opencv-ransac-random-sequence/, OpenCV fixes the random seed used to draw the samples. As a result, one will always obtain the same result as long as the arguments are not changed. – Javier TG Jun 24 '22 at 10:21

2 Answers

I thought RANSAC is supposed to randomly select a subset of points to do the fitting, so why does it return the same transformation matrix every time?

RANSAC repeatedly selects a subset of points, then fits a model based upon them, then checks how many data points in the data set are inliers given that fitted model. Once it's done that lots of times, it picks the fitted model that had the most inliers, and refits the model to those inliers.
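To make that loop concrete, here is a minimal sketch for the simpler problem of fitting a line to 2D points. It is not OpenCV's implementation: the Point and Line types, the helper names and the seed parameter are all made up for illustration, and the final refit to the inliers is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Point { double x, y; };

// Line a*x + b*y + c = 0, with (a, b) scaled to unit length so that
// |a*x + b*y + c| is the distance from (x, y) to the line.
struct Line { double a, b, c; };

// Hypothetical helper: line through two points (assumes p != q).
Line lineThrough(const Point& p, const Point& q) {
    double a = q.y - p.y;
    double b = p.x - q.x;
    double norm = std::hypot(a, b);
    a /= norm;
    b /= norm;
    return {a, b, -(a * p.x + b * p.y)};
}

// Count the points whose distance to the line is within the threshold.
std::size_t countInliers(const std::vector<Point>& pts, const Line& l,
                         double threshold) {
    std::size_t n = 0;
    for (const Point& p : pts)
        if (std::fabs(l.a * p.x + l.b * p.y + l.c) <= threshold) ++n;
    return n;
}

// Minimal RANSAC loop: sample two points, fit, score, keep the best model.
Line ransacLine(const std::vector<Point>& pts, int iterations,
                double threshold, unsigned seed) {
    assert(pts.size() >= 2);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, pts.size() - 1);
    Line best = {0.0, 1.0, 0.0};
    std::size_t bestCount = 0;
    for (int i = 0; i < iterations; ++i) {
        const Point& p = pts[pick(rng)];
        const Point& q = pts[pick(rng)];
        if (p.x == q.x && p.y == q.y) continue;  // degenerate sample, skip
        Line candidate = lineThrough(p, q);
        std::size_t count = countInliers(pts, candidate, threshold);
        if (count > bestCount) {
            bestCount = count;
            best = candidate;
        }
    }
    return best;
}
```

The only randomness is in which two points get sampled on each iteration; the model that is returned is whichever candidate collected the most inliers.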

For any given data set, set of variable model parameters, and rule for what constitutes an inlier, there will exist one or more (but often exactly one) largest possible set of "inliers". For example, given this data set (image from Wikipedia):

A graph showing a straight line of points, with a bunch of random outliers scattered around

... then with some sort of reasonable definition of an outlier, the maximal possible set of inliers any linear model can have is the one in blue below:

Same image as before, but with the inliers in blue, the outliers in red, and a blue line of best fit through the inliers

Let's call the set of blue points above - the maximal possible set of inliers - I.

If you randomly select a small number of points (e.g. two or three) and draw a line of best fit through them, it's hopefully intuitively obvious that it'll only take you a handful of tries until you hit an iteration where:

  • all the randomly-selected points you pick are from I, and so
  • the line of best fit through those points is roughly equal to the line of best fit in the graph above, and so
  • the set of inliers found on that iteration is exactly I

From that iteration onwards, all further iterations are a waste that cannot possibly improve the model further (although RANSAC has no way of knowing this, since it doesn't magically know when it's found the maximal set of inliers).

If you have a large enough number of iterations relative to the size of your data set, and a large enough proportion of the data set are inliers, then you will eventually find the maximal set of inliers with a close to 100% chance every time you run RANSAC. As a consequence, RANSAC will (almost) always output exactly the same model.

And that's a good thing! Often, you want RANSAC to find the absolute maximal set of inliers and don't want to settle for anything less. If you're getting different results each time you run RANSAC in such a scenario, that's a sign that you want to increase your number of iterations.

(Of course, in the case above we're talking about trying to fit a line through points in a 2D plane, which isn't what findHomography does, but the principle is the same; there will typically still be a single maximal set of inliers and eventually RANSAC will find it.)

How can I make this behaviour actually random?

Decrease the number of iterations (maxIters) so that RANSAC sometimes fails to find the maximal set of inliers.

But there's generally no reason to do this besides pure intellectual curiosity; you'll basically be deliberately telling RANSAC to output an inferior model.

Mark Amery

findHomography will already give you the optimal transformation. The real question is about the meaning of optimal.

For example, with RANSAC you'll have the model with maximum number of inliers, while with LMEDS you'll have the model with minimum median error.

You can modify default behavior by:

  • changing the number of RANSAC iterations by setting maxIters (the maximum allowed is 2000)
  • decreasing (or increasing) the ransacReprojThreshold used to separate inliers from outliers (usually between 1 and 10).
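As a rough guide for picking maxIters, the textbook relation between inlier ratio and iteration count is N = log(1 - p) / log(1 - w^s), where w is the inlier ratio, s the sample size (4 point pairs for a homography) and p the desired probability of drawing at least one all-inlier sample. The function below is only a sketch of that formula; its name and parameters are made up here and are not part of OpenCV.

```cpp
#include <cassert>
#include <cmath>

// Number of RANSAC iterations needed so that, with probability `confidence`,
// at least one random sample of `sampleSize` points contains only inliers.
// Implements N = log(1 - p) / log(1 - w^s), rounded up.
int ransacIterations(double inlierRatio, int sampleSize, double confidence) {
    assert(inlierRatio > 0.0 && inlierRatio < 1.0);
    assert(confidence > 0.0 && confidence < 1.0);
    assert(sampleSize > 0);
    double allInliers = std::pow(inlierRatio, sampleSize);  // w^s
    double n = std::log(1.0 - confidence) / std::log(1.0 - allInliers);
    return static_cast<int>(std::ceil(n));
}
```

For example, ransacIterations(0.5, 4, 0.995) gives 83 and ransacIterations(0.8, 4, 0.995) gives 11, so with a reasonable share of good matches the 2000 limit is far more than is needed.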

Regarding your questions:

No matter how many times I run it, I get the same transformation matrix.

Your points are probably good enough that you always find the optimal model.

I thought RANSAC is supposed to randomly select a subset of points to do the fitting

RANSAC (RANdom SAmple Consensus) first selects a random subset, then checks whether the model built from these points is good enough. If not, it selects another random subset.

How can I make this behaviour actually random?

I can't imagine a scenario where this would be useful, but you can randomly select 4 pairs of corresponding points from src and dst, and use getPerspectiveTransform. Unless your points are perfect, you'll get a different matrix for each subset.
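The random selection step could look something like the sketch below. pickFourIndices is a made-up helper, not an OpenCV function, and the OpenCV calls are shown only in the trailing comment so that the snippet compiles on its own.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Pick four distinct indices in [0, count) uniformly at random.
std::array<std::size_t, 4> pickFourIndices(std::size_t count,
                                           std::mt19937& rng) {
    assert(count >= 4);
    std::vector<std::size_t> idx(count);
    std::iota(idx.begin(), idx.end(), std::size_t{0});  // 0, 1, ..., count-1
    std::shuffle(idx.begin(), idx.end(), rng);
    std::array<std::size_t, 4> out = {{idx[0], idx[1], idx[2], idx[3]}};
    return out;
}

// With OpenCV available, and assuming src and dst are
// std::vector<cv::Point2f> of equal length, the indices would be used as:
//   cv::Point2f s[4], d[4];
//   for (int k = 0; k < 4; ++k) { s[k] = src[pick[k]]; d[k] = dst[pick[k]]; }
//   cv::Mat h = cv::getPerspectiveTransform(s, d);
```

Seed the generator from std::random_device if you want a different subset on every run. Note that a subset with three nearly collinear points gives a degenerate or badly conditioned matrix, so some of the results will be useless.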

Miki
  • "with RANSAC you'll have the model with minimum reprojection error" - actually you have the model with the maximum number of inliers, which is not necessarily the model with minimum reprojection error. – Miau Jul 15 '21 at 22:03