I have read both of Lowe's papers ('99 and '04) and I would say I understood most of them. I have also watched all the SIFT-related classes on YouTube, but none of them explicitly says why we would use both octaves and layers.
I understand that you get more layers in the same octave by ~calculating the ~Laplacian for different sigmas, and that you then resample to half the resolution to get the next octave, where you again ~calculate the ~Laplacian for the same sigmas as in the first octave. You repeat this for as many octaves as you like.
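To make the structure I describe above concrete, here is a minimal sketch of the nominal scale of each (octave, layer) pair. It assumes the idealized schedule sigma = sigma0 * 2^(octave + layer / layers_per_octave) with sigma0 = 1.6 as in Lowe's 2004 paper; the function name `nominal_scales` is my own, and a real implementation such as OpenCV's may differ in the details.

```python
def nominal_scales(n_octaves, n_layers, sigma0=1.6):
    """Return (octave, layer, sigma) triples, sigma measured in original-image pixels.

    Assumption: idealized schedule sigma0 * 2**(octave + layer / n_layers).
    """
    scales = []
    for octave in range(n_octaves):
        for layer in range(n_layers):
            sigma = sigma0 * 2.0 ** (octave + layer / n_layers)
            scales.append((octave, layer, sigma))
    return scales


# Print the nominal scale of every layer for 2 octaves with 2 layers each.
for octave, layer, sigma in nominal_scales(2, 2):
    print(f"octave {octave}, layer {layer}: sigma = {sigma:.3f} "
          f"({sigma / 1.6:.2f}x the base scale)")
```

Under this assumed schedule, 2 octaves with 2 layers each correspond to relative sizes of about 1, 1.41, 2 and 2.83.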
Initially, I thought that you use the layers (multiple sigmas) to find features of different sizes in one image, and that you then resample so that you can calculate a descriptor on every octave (resampling level) for every feature. That way you would get descriptors at different scales, which might be better matches for descriptors in the other image at a similar scale. Apparently I was wrong: only one descriptor is calculated for every feature, and since it is calculated from gradient orientations, it is ~invariant to scale anyway.
But this leaves me wondering why we need to resample at all. Why can't or shouldn't we just use a high number of layers and a single octave (no resampling)? Is it just because it is cheaper to resample? If so, why don't we only resample, with a single layer per octave?
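To make the "is it just cheaper?" part of the question concrete, here is a rough cost sketch. It is only a toy model under my own assumptions: each layer is produced by one separable Gaussian blur whose cost is proportional to the number of pixels times a kernel width of about 6 * sigma + 1, and each octave halves the resolution in both directions. The function names are hypothetical, and real implementations blur incrementally, so the absolute numbers mean little; only the trend is of interest.

```python
def blur_cost(pixels, sigma):
    # Assumed cost of a separable Gaussian blur: two 1-D passes,
    # each with a kernel roughly 6 * sigma + 1 taps wide.
    return 2 * pixels * (6 * sigma + 1)


def cost_single_octave(pixels, n_octaves, n_layers, sigma0=1.6):
    # Cover the same scale range at full resolution: sigma keeps growing.
    total = 0.0
    for octave in range(n_octaves):
        for layer in range(n_layers):
            total += blur_cost(pixels, sigma0 * 2 ** (octave + layer / n_layers))
    return total


def cost_pyramid(pixels, n_octaves, n_layers, sigma0=1.6):
    # Resample between octaves: a quarter of the pixels per octave,
    # and the same small sigmas are reused in every octave.
    total = 0.0
    for octave in range(n_octaves):
        for layer in range(n_layers):
            total += blur_cost(pixels / 4 ** octave, sigma0 * 2 ** (layer / n_layers))
    return total


pixels = 1_000_000
ratio = cost_single_octave(pixels, 4, 3) / cost_pyramid(pixels, 4, 3)
print(f"single-octave cost / pyramid cost = {ratio:.1f}")
```

In this model the two approaches cost the same for one octave, and the gap widens with every additional octave, because the no-resampling variant needs ever larger kernels on the full-size image.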
I did an experiment using OpenCV to see how and what is detected. Here are my observations:
1 octave, 1 layer => all features have exactly the same size, as expected. 263 matches found.
1 octave, 2 layers => all features from the 1-octave, 1-layer test are found, plus some other features that are about x1.35 larger than the small ones. 326 matches found.
2 octaves, 1 layer => most (maybe all) features from the 1-octave, 1-layer test are found, plus some other features that are exactly x2 as big, which is again expected since I resampled to half the resolution. 318 matches found.
2 octaves, 2 layers => features have x1, x1.35 or x2 size. I couldn't find any of x2.7 size, as I would have expected. Only 299 matches found. I suppose that with multiple closely spaced layers, more features looked too much alike and failed the ratio test, so more layers might actually decrease the number of tie points.
Note: the ~ sign means "sort of". I use it when I know that what I wrote is not the exact explanation, but the exact one would be longer and wouldn't add any value to the question.