How to normalize feature-wise and sample-wise?

Question

I'm implementing a content-based image retrieval (CBIR) system based on feature extraction by histogram, HOG, and local binary patterns. Each of these (normalized) feature sets is stored separately in a CSV file so that distances can be calculated in a later step. The file looks like this:

img_ID0, 0.0, 0.0, 0.0, 0.4, 0.1, ...
img_ID1, 0.0, 0.1, 0.0, 0.2, 0.1, ...
img_ID2, 0.2, 0.0, 0.0, 0.4, 0.0, ...

I flatten the ndarray and normalize along the entire flattened array, which should be sample-wise normalization (I'm not sure about this, so please correct me).

Now, what would feature-wise normalization look like, especially since I don't really have "named" columns? Should I have normalized along the (unflattened) image, or later, column-wise over the flattened arrays of all images?

The literature just says that feature-wise normalization is commonly used, but that it still depends on the application. CBIR sources seem to be very vague about this.

score 0 · Answer 1 · answered Apr 08 '20 at 14:01

Assuming your data before any normalization looks like this:

img_ID0, feat1_val, feat2_val, feat3_val,...
img_ID1, feat1_val, feat2_val, feat3_val,...
img_ID2, feat1_val, feat2_val, feat3_val,...

Each line is a new image (= sample), and each column is a feature. In that case, sample-wise normalization means normalizing along each line ("What is the relative value of feature X compared to feature Y for sample N?"), and feature-wise normalization means normalizing along each column ("What is the relative value of feature X for sample N compared to sample M?").
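
A minimal NumPy sketch of the two options, assuming the CSV has been loaded into a 2-D array `X` (rows = images, columns = feature values). The values are copied from the excerpt in the question, and the L2 and min-max scalings are only illustrative choices; other scalings (e.g. z-scores) work along the same axes:

```python
import numpy as np

# Illustrative feature matrix: one row per image, one column per feature value.
X = np.array([
    [0.0, 0.0, 0.0, 0.4, 0.1],
    [0.0, 0.1, 0.0, 0.2, 0.1],
    [0.2, 0.0, 0.0, 0.4, 0.0],
])

eps = 1e-12  # guards against division by zero for all-zero rows or constant columns

# Sample-wise: scale each row (image) to unit L2 norm.
row_norms = np.linalg.norm(X, axis=1, keepdims=True)
X_samplewise = X / np.maximum(row_norms, eps)

# Feature-wise: min-max scale each column to [0, 1] across all images.
col_min = X.min(axis=0, keepdims=True)
col_range = X.max(axis=0, keepdims=True) - col_min
X_featurewise = (X - col_min) / np.maximum(col_range, eps)
```

Note that the columns do not need names: `axis=0` simply means "the same position in every image's feature vector".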

Flattening the array before normalizing, as you did, would be yet another type of normalization. See also https://stats.stackexchange.com/questions/354774/should-i-normalize-featurewise-or-samplewise
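
For comparison, here is a rough sketch of what normalizing over the flattened array amounts to, assuming "the array" is the whole image-by-feature matrix (if you flatten each image's array separately, you are back to the per-row case). The matrix `X` and the min-max scaling are illustrative assumptions:

```python
import numpy as np

# Illustrative feature matrix: one row per image, one column per feature value.
X = np.array([
    [0.0, 0.0, 0.0, 0.4, 0.1],
    [0.0, 0.1, 0.0, 0.2, 0.1],
    [0.2, 0.0, 0.0, 0.4, 0.0],
])

# A single minimum and maximum taken over every entry, ignoring rows and columns.
g_min, g_max = X.min(), X.max()

if g_max > g_min:
    X_global = (X - g_min) / (g_max - g_min)
else:
    # Degenerate case: all entries are identical, so there is nothing to scale.
    X_global = np.zeros_like(X)
```

Every entry is shifted and scaled by the same two numbers, so the relative sizes of values are preserved both across images and across features.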


al-dev


Alright, that answer points me in the right direction. You mention that flattening the array beforehand would be another type of normalization. Which one? – Viktoria Apr 08 '20 at 14:53
Oh, and one more thing: it is rather `feat_val1, feat_val2, feat_val3`. E.g., I'm calculating the histogram for every image and saving it as an array. The entries of that array are those values. – Viktoria Apr 08 '20 at 15:02
Flattening the array beforehand would give you a 'global scaling/normalization'. I am unaware of a better term. – al-dev Apr 08 '20 at 15:09
