In Andrew Ng's lecture on non-linear hypotheses (the introduction to neural networks), I came across an MCQ asking for the number of features for a 100x100 image of greyscale intensities.
The answer given was 50 million, i.e. 5 x 10^7.
However, earlier, for a 50 x 50 pixel greyscale image, the number of features was 50 x 50 = 2500, and for an RGB image it was 7500.
Why would it be 5 x 10^7 instead of 10,000?
He does, however, say to include all quadratic terms (xi,xj) as features.
The question is:
Suppose you are learning to recognize cars from 100×100 pixel images (grayscale, not RGB). Let the features be pixel intensity values. If you train logistic regression including all the quadratic terms (xi,xj) as features, about how many features will you have?
Earlier he also said that if we were to use the xi, xj terms, we would end up with a total of 3 million features. I still couldn't work out how these numbers relate to each other.
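
To make the numbers concrete, here is a quick count I tried. It is only a sketch under my own assumption that "all the quadratic terms" means every product xi*xj with i <= j (squares included, each unordered pair counted once). The helper name `count_quadratic_terms` is mine, not something from the course.

```python
from math import comb


def count_quadratic_terms(n_pixels: int) -> int:
    """Count products x_i * x_j with i <= j (assumed meaning of 'quadratic terms')."""
    # comb(n, 2) counts the pairs with i < j; adding n counts the squares x_i * x_i.
    return comb(n_pixels, 2) + n_pixels


for side in (50, 100):
    n = side * side  # one feature per greyscale pixel
    print(f"{side}x{side}: {n} pixels, {count_quadratic_terms(n)} quadratic terms")
```

Under that assumption the count is n(n+1)/2, roughly n^2/2: about 3.1 million for 2500 pixels and about 5 x 10^7 for 10,000 pixels, which would line up with both figures quoted in the lecture.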