I'm implementing a content-based image retrieval (CBIR) system based on three feature extractors: histogram, HOG, and local binary pattern (LBP). The (normalized) output of each extractor is stored in its own CSV file so that distances can be calculated in a later step. Each file looks like this:
img_ID0, 0.0, 0.0, 0.0, 0.4, 0.1, ...
img_ID1, 0.0, 0.1, 0.0, 0.2, 0.1, ...
img_ID2, 0.2, 0.0, 0.0, 0.4, 0.0, ...
I flatten the ndarray and normalize over the entire flattened array. I believe this is sample-wise normalization (I'm not sure about it, so please correct me if I'm wrong).
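To make clear what I mean by normalizing over the flattened array, here is a minimal sketch. The L2 norm, the function name `normalize_sample`, and the small example descriptor are assumptions for illustration only; the same idea would apply with an L1 or min-max scheme.

```python
import numpy as np

def normalize_sample(features):
    """Scale one flattened descriptor to unit L2 norm (sample-wise)."""
    flat = np.asarray(features, dtype=float).ravel()
    norm = np.linalg.norm(flat)
    # Guard against an all-zero descriptor to avoid division by zero
    return flat / norm if norm > 0 else flat

# Hypothetical 2x3 descriptor for a single image (made-up values)
descriptor = np.array([[0.0, 3.0, 0.0],
                       [4.0, 0.0, 0.0]])
normalized = normalize_sample(descriptor)
```

Only the values of this one image are used here, so each row of the CSV is scaled independently of all other rows.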
Now, what would feature-wise normalization look like, especially since I don't really have "named" columns? Should I have normalized along the (unflattened) image, or later on, column-wise across the flattened arrays of all images?
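As far as I understand it, the column-wise variant would look roughly like the sketch below, where columns are addressed by position rather than by name. The z-score scaling, the helper `standardize_columns`, and the inline CSV text (a stand-in for my real file, with made-up values) are assumptions; I don't know whether this is the right choice for CBIR.

```python
import csv
import io
import numpy as np

# Stand-in for one of the CSV files described above (values are made up)
csv_text = """img_ID0, 0.0, 0.0, 0.0, 0.4, 0.1
img_ID1, 0.0, 0.1, 0.0, 0.2, 0.1
img_ID2, 0.2, 0.0, 0.0, 0.4, 0.0
"""

ids, rows = [], []
for record in csv.reader(io.StringIO(csv_text)):
    ids.append(record[0].strip())                        # first field is the image ID
    rows.append([float(value) for value in record[1:]])  # remaining fields are features

X = np.array(rows)  # shape: (n_images, n_features); columns need no names

def standardize_columns(X):
    """Z-score each column using statistics computed over all images."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns at zero instead of dividing by zero
    return (X - mean) / std

X_scaled = standardize_columns(X)
```

Here every column (one histogram bin, one HOG cell value, and so on) is scaled with the mean and standard deviation taken over all images, so the result depends on the whole data set rather than on a single image.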
The literature only says that feature-wise normalization is commonly used, but that the choice still depends on the application. For CBIR in particular, it seems to be very vague about this.