The term feature is commonly used for two different things:
- feature detectors,
- feature descriptors.
A detector aims at finding good interest points, i.e., points that are stable under viewpoint and illumination changes and that yield good performance in tasks like homography estimation or object detection.
A descriptor aims at reaching good matching performance for the detected points under those same viewpoint and illumination changes.
Some detectors were designed on their own, without any accompanying descriptor. This is the case for most of the oldest interest point detectors (Moravec, Harris, Good Features to Track) and for a small portion of the recent ones (FAST).
Then, a major performance improvement was reached through the co-design of point detectors and descriptors, and this is the approach embraced by SIFT and SURF.
For simplicity, the descriptor was not given a separate name: it shares the name of its detector (although you can remark that SIFT descriptors and HOG features are very close to each other).
These descriptors are real valued (i.e., floating point vectors).
Later, in order to have fast running times on limited hardware, a new keypoint detector (FAST) was designed. FAST relies on simple binary tests that compare the intensity of a candidate pixel with the pixels on a small circle around it.
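To make the idea of binary tests concrete, here is a minimal pure-Python sketch of the FAST segment test. It is a simplification under stated assumptions: the image is a plain list of rows of gray levels, the names `is_fast_corner`, `threshold` and `n` are illustrative, and the speed-up pretest and non-maximum suppression used by real implementations are left out.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle, in clockwise order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_corner(img, x, y, threshold=20, n=9):
    """Return True if at least n contiguous circle pixels are all brighter
    than center + threshold, or all darker than center - threshold."""
    center = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):  # +1 checks "brighter", -1 checks "darker"
        flags = [sign * (value - center) > threshold for value in ring]
        run = 0
        # Walk the ring twice so that runs wrapping past the end are counted.
        for flag in flags + flags:
            run = run + 1 if flag else 0
            if run >= n:
                return True
    return False


# A dark pixel on a bright background passes the test...
spot = [[200] * 7 for _ in range(7)]
spot[3][3] = 50
# ...while a pixel on a straight vertical edge does not.
edge = [[50 if col <= 3 else 200 for col in range(7)] for _ in range(7)]
```

Each test is a single comparison against a threshold, which is what makes the detector cheap on limited hardware.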
The same approach of binary tests was then used to design descriptors, and this is how you got BRIEF, BRISK, FREAK, ORB...
Thus, what you get is binary descriptors (bit strings).
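As a rough illustration of how such a bit string can be built, the sketch below follows the general BRIEF idea: compare the intensities of fixed pixel pairs around the keypoint and store one bit per comparison. The helper names (`make_test_pairs`, `binary_descriptor`), the uniform random sampling pattern, and the 32-bit length are assumptions made for the example; real descriptors smooth the patch first and use longer, carefully chosen patterns.

```python
import random


def make_test_pairs(n_bits=32, patch_radius=4, seed=0):
    """Draw a fixed set of pixel-offset pairs; the same set must be reused
    for every keypoint so that descriptors are comparable."""
    rng = random.Random(seed)

    def offset():
        return (rng.randint(-patch_radius, patch_radius),
                rng.randint(-patch_radius, patch_radius))

    return [(offset(), offset()) for _ in range(n_bits)]


def binary_descriptor(img, x, y, pairs):
    """Pack one intensity comparison per pair into an integer bit string."""
    bits = 0
    for (ax, ay), (bx, by) in pairs:
        bit = 1 if img[y + ay][x + ax] < img[y + by][x + bx] else 0
        bits = (bits << 1) | bit
    return bits


# Synthetic 11x11 image, and a uniformly brighter copy of it.
img = [[(3 * col + 7 * row * row) % 50 for col in range(11)] for row in range(11)]
brighter = [[value + 40 for value in line] for line in img]

pairs = make_test_pairs()
d_original = binary_descriptor(img, 5, 5, pairs)
d_brighter = binary_descriptor(brighter, 5, 5, pairs)
```

Because only the order of two intensities matters, adding a constant brightness offset to the whole patch leaves the descriptor unchanged, so `d_original` and `d_brighter` are equal.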
To summarize:
- you can cross descriptors and detectors as you like. Just be careful that when a detector does not have a scale you may have to guess one (or impose a default one) for the descriptors that require it (SIFT, SURF);
- any matcher can be used as long as you have the same type of descriptors from each image. What will vary is the feature distance used by the matcher;
- SIFT and SURF are real valued, so they are typically matched using an L2 (Euclidean) distance. Recent descriptors (BRIEF, BRISK, FREAK, ORB) are binary, so their distances should be measured with the Hamming distance.
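The last two points can be sketched with a tiny brute-force matcher in which only the distance function changes. This is an illustrative pure-Python version, not a library API: `match`, `l2_distance` and `hamming_distance` are names chosen for the example, real-valued descriptors are lists of floats, and binary descriptors are stored as integers.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two real-valued descriptors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def hamming_distance(a, b):
    """Number of differing bits between two binary descriptors (ints)."""
    return bin(a ^ b).count("1")


def match(query, train, distance):
    """For each query descriptor, return (query_idx, train_idx, distance)
    of its nearest neighbour in train under the given distance."""
    matches = []
    for qi, q in enumerate(query):
        ti = min(range(len(train)), key=lambda i: distance(q, train[i]))
        matches.append((qi, ti, distance(q, train[ti])))
    return matches


# Real-valued descriptors (SIFT/SURF style) are matched with L2.
float_matches = match([[0.0, 1.0], [5.0, 5.0]],
                      [[5.0, 4.0], [0.0, 1.5]],
                      l2_distance)

# Binary descriptors (BRIEF/ORB style) are matched with Hamming.
binary_matches = match([0b1010, 0b1111],
                       [0b1111, 0b1000],
                       hamming_distance)
```

The matching loop is identical in both cases; what must stay consistent is that both images provide the same type of descriptor and that the distance fits that type.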