I've spent a few months studying and experimenting with keypoint detection, description, and matching. Lately, I've also been looking into the concepts behind augmented reality, specifically "markerless" recognition and pose estimation.
Luckily, I've found that those concepts are still widely used in this setting. A common pipeline for creating basic augmented reality is the following, without going into the details of each algorithm involved:
While capturing a video, at every frame...
- Detect some keypoints and compute their descriptors
- Find matches between those descriptors and the ones from a previously saved "marker" (e.g. a photo)
- If there are enough matches, estimate the pose of the visible object and play with it
That is a very simplified version of the procedure used, for example, by this student(?) project.
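To make the question concrete, here is a minimal sketch of that pipeline as I currently understand it, using OpenCV's Python bindings (ORB + brute-force matching + a RANSAC homography; the marker file name and the match-count threshold are placeholders I picked, not values from any particular project):

```python
import cv2
import numpy as np

# Load the saved "marker" image (placeholder file name) in grayscale
marker = cv2.imread("marker.png", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute descriptors on the marker once, up front
orb = cv2.ORB_create()
marker_kp, marker_desc = orb.detectAndCompute(marker, None)

# Brute-force matcher with Hamming distance (suits ORB's binary descriptors)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # 1. Get keypoints and descriptors for the current frame
    frame_kp, frame_desc = orb.detectAndCompute(gray, None)
    if frame_desc is None:
        continue

    # 2. Match them against the marker's descriptors
    matches = matcher.match(marker_desc, frame_desc)

    # 3. If there are "enough" matches (threshold chosen arbitrarily here),
    #    estimate the pose; a homography is the simplest planar case
    if len(matches) > 15:
        src = np.float32([marker_kp[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([frame_kp[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        if H is not None:
            # H maps marker coordinates into the frame: draw the marker outline
            h, w = marker.shape
            corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
            cv2.polylines(frame, [np.int32(cv2.perspectiveTransform(corners, H))],
                          True, (0, 255, 0), 3)

    cv2.imshow("ar", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```

(In a real application the homography would be combined with the camera intrinsics to recover a full 6-DOF pose, but the outline above should be enough to show the structure.)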
Now the question: during my research, I've also come across another technique called "optical flow". I'm still at the beginning of my study of it, but first I would like to know how different it is from the method described above (see the snippet after the questions below). Specifically:
- What are the main concepts behind it? Does it use a "subset" of the algorithms roughly described above?
- What are the main differences in terms of computational cost, performance, stability, and accuracy? (I know this may be too general a question.)
- Which of the two is more commonly used in commercial AR tools (junaio, Layar, ...)?
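For context, this is the kind of usage I've seen so far in OpenCV samples, as far as I understand it: a sparse pyramidal Lucas-Kanade tracker that follows points from frame to frame without computing any descriptors at all (the parameter values are just the ones from the samples, not a recommendation):

```python
import cv2

cap = cv2.VideoCapture(0)
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# Pick an initial set of Shi-Tomasi corners to follow;
# note that no descriptors are computed at any point
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                 qualityLevel=0.01, minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    if points is None or len(points) == 0:
        # All tracks lost: re-detect corners and try again next frame
        points = cv2.goodFeaturesToTrack(gray, 200, 0.01, 7)
        prev_gray = gray
        continue

    # Track each point from the previous frame into the current one.
    # Instead of matching descriptors, Lucas-Kanade searches a small
    # window around each point's previous position.
    next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                     points, None)
    good = next_pts[status.flatten() == 1]

    for x, y in good.reshape(-1, 2):
        cv2.circle(frame, (int(x), int(y)), 3, (0, 255, 0), -1)

    cv2.imshow("flow", frame)
    if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
        break

    prev_gray = gray
    points = good.reshape(-1, 1, 2)

cap.release()
cv2.destroyAllWindows()
```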
Thanks for your help.