I've spent a few months studying and experimenting with keypoint detection, description, and matching. Lately, I've also been looking into the concepts behind augmented reality, specifically "markerless" recognition and pose estimation.
Luckily, I've found that those concepts are still widely used in this setting. A common pipeline for creating basic augmented reality is the following, without going into the details of each algorithm involved:
While capturing a video, at every frame...
- Detect some keypoints and compute their descriptors
- Find matches between those descriptors and the ones from a previously saved "marker" (e.g. a photo)
- If there are enough matches, estimate the pose of the visible object and play with it
That is a very simplified version of the procedure used, for example, by this student(?) project.
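To make the question concrete, here is a minimal sketch of that pipeline as I currently understand it, using OpenCV's Python bindings (ORB + brute-force matching + a RANSAC homography; the marker file name and the match-count threshold are placeholders I picked, not values from any particular project):

```python
import cv2
import numpy as np

# Load the saved "marker" image (placeholder file name) in grayscale
marker = cv2.imread("marker.png", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute descriptors on the marker once, up front
orb = cv2.ORB_create()
marker_kp, marker_desc = orb.detectAndCompute(marker, None)

# Brute-force matcher with Hamming distance (suits ORB's binary descriptors)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # 1. Get keypoints and descriptors for the current frame
    frame_kp, frame_desc = orb.detectAndCompute(gray, None)
    if frame_desc is None:
        continue

    # 2. Match them against the marker's descriptors
    matches = matcher.match(marker_desc, frame_desc)

    # 3. If there are "enough" matches (threshold chosen arbitrarily here),
    #    estimate the pose; a homography is the simplest planar case
    if len(matches) > 15:
        src = np.float32([marker_kp[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([frame_kp[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        if H is not None:
            # H maps marker coordinates into the frame: draw the marker outline
            h, w = marker.shape
            corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
            cv2.polylines(frame, [np.int32(cv2.perspectiveTransform(corners, H))],
                          True, (0, 255, 0), 3)

    cv2.imshow("ar", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```

(In a real application the homography would be combined with the camera intrinsics to recover a full 6-DOF pose, but the outline above should be enough to show the structure.)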
Now the question: during my research, I've also come across another technique called "optical flow". I'm still at the beginning of my study of it, but first I would like to know how different it is from the method described above (see the snippet after the questions below). Specifically:
- What are the main concepts behind it? Does it use a "subset" of the algorithms roughly described above?
- What are the main differences in terms of computational cost, performance, stability, and accuracy? (I know this may be too general a question.)
- Which of the two is more commonly used in commercial AR tools (junaio, Layar, ...)?
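For context, this is the kind of usage I've seen so far in OpenCV samples, as far as I understand it: a sparse pyramidal Lucas-Kanade tracker that follows points from frame to frame without computing any descriptors at all (the parameter values are just the ones from the samples, not a recommendation):

```python
import cv2

cap = cv2.VideoCapture(0)
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# Pick an initial set of Shi-Tomasi corners to follow;
# note that no descriptors are computed at any point
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                 qualityLevel=0.01, minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    if points is None or len(points) == 0:
        # All tracks lost: re-detect corners and try again next frame
        points = cv2.goodFeaturesToTrack(gray, 200, 0.01, 7)
        prev_gray = gray
        continue

    # Track each point from the previous frame into the current one.
    # Instead of matching descriptors, Lucas-Kanade searches a small
    # window around each point's previous position.
    next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                     points, None)
    good = next_pts[status.flatten() == 1]

    for x, y in good.reshape(-1, 2):
        cv2.circle(frame, (int(x), int(y)), 3, (0, 255, 0), -1)

    cv2.imshow("flow", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break

    prev_gray = gray
    points = good.reshape(-1, 1, 2)

cap.release()
cv2.destroyAllWindows()
```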
Thanks for your help.