Your question goes deeper than it may seem. According to the OpenCV documentation:
predict()
Predicts a label and associated confidence (e.g. distance) for a given
input image
I am not sure what you are looking for here, but the question is not easy to answer. Intra-person variation (images of the same person) can be vast, while inter-person variation (images of different people) can be smaller. For example, two different people both facing the camera may end up closer to each other than a frontal image and a profile image of the same person. This is a whole research topic in itself.
You should probably build a ground truth (i.e. a set of faces whose labels are already known) and deduce from this set the percentages you want by associating the distances with the labels. Even this is often inaccurate, though, because the distance will not necessarily coincide with your perception of similarity (as mentioned above, images of the same person can vary a lot).
Edit:
First of all, there is no universal human perception of face similarity. On the other hand, most people would recognize a face that belongs to the same person in various poses and postures. The word "most" is important here. As you push the limits, human perception starts to diverge, e.g. when people are asked to recognize a face across the years and the time span becomes quite large (child vs. adolescent vs. old person).
Are you asking how to compute the similarity of noses, eyes, etc.? If so, I think the best way is to collect a set of noses/eyes belonging to the same persons, train on it, and then check your performance on a different set from different persons.
The usual approach, as far as I know, is to train and test using pairs of images comprising positive and negative samples. A positive sample is a pair of images belonging to the same person, while a negative one is a pair of images belonging to two different persons.
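As a rough illustration of how such pairs can be used, here is a minimal sketch that picks a distance threshold from already-computed pair distances. The function name bestThreshold and the "same person if distance <= threshold" rule are assumptions made for this example, not part of OpenCV; the distances themselves would come from whatever recognizer you use.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV function): returns the distance
// threshold that classifies the most pairs correctly, where a pair is
// predicted "same person" when its distance is <= threshold.
double bestThreshold(const std::vector<double>& positiveDistances,
                     const std::vector<double>& negativeDistances)
{
    // Every observed distance is a candidate threshold.
    std::vector<double> candidates(positiveDistances);
    candidates.insert(candidates.end(),
                      negativeDistances.begin(), negativeDistances.end());

    double best = 0.0;
    std::size_t bestCorrect = 0;
    for (double t : candidates) {
        std::size_t correct = 0;
        // Positive pairs are correct when they fall at or below the threshold.
        for (double d : positiveDistances) if (d <= t) ++correct;
        // Negative pairs are correct when they fall above the threshold.
        for (double d : negativeDistances) if (d > t) ++correct;
        if (correct > bestCorrect) {
            bestCorrect = correct;
            best = t;
        }
    }
    return best;
}
```

Calling bestThreshold with the distances of your positive pairs and your negative pairs gives a cut-off you can then check on a held-out set of pairs.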
I am not sure what you are asking exactly so maybe you can check out this link.
Hope it helped.
Edit 2:
Well, since you want to convert the distance you are getting into a similarity expressed as a percentage, you can somehow invert the distance to get the similarity. Some problems arise here, though:
- There is an explicit value for an absolute match, namely dis = 0, or equivalently sim = 100%, but there is no explicit value for a total mismatch: dis = infinity, so sim = 0%. The similarity scale, on the other hand, has explicit bounds, 0% - 100%.
- Since the extreme values include 0 and infinity, there must be a smarter conversion than simple inversion.
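One possible conversion that respects both extremes is an exponential decay, which maps distance 0 to 100% and approaches 0% as the distance grows without bound. This is only a sketch of one option, not something OpenCV provides: the function name and the scale parameter are assumptions, and scale would have to be tuned on your own data.

```cpp
#include <cmath>

// Hypothetical mapping (an assumption, not an OpenCV function):
// sim = 100 * exp(-dis / scale).
// `scale` is a tuning constant: at dis == scale the similarity is about 36.8%.
double similarityFromDistanceExp(double dis, double scale)
{
    if (dis < 0.0) dis = 0.0;  // distances are expected to be non-negative
    return 100.0 * std::exp(-dis / scale);
}
```

With this mapping no distance ever produces a negative percentage, but the percentages are only as meaningful as the chosen scale.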
You can easily assign 1.0 (or 100% similarity) to the absolute match, but what you should take as a total mismatch is not clear. You can pick an arbitrarily high distance to stand for 0.0 (I guess there is no big difference between, e.g., a distance of 10000 and one of 11000) and treat all distances higher than that as 0.0 as well.
To find what that value should be, I would suggest comparing two quite distinct images and using the distance between them as the distance that maps to 0.0 similarity.
Let's suppose that this value is disMax = 250.0 and that simMax = 100.0. Then a simple approach could be:
double sim = simMax - simMax/disMax*dis;
which gives a similarity of 100.0 for a distance of 0 and 0.0 for a distance of 250. Values larger than 250 give negative similarities, which should be treated as 0.0.
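The linear formula together with the clamping to 0.0 can be wrapped in a small helper. This is a minimal sketch: the function name is made up for the example, and the default disMax = 250.0 is just the illustrative value used above, which you would replace with the distance measured between two clearly different faces.

```cpp
#include <algorithm>

// Hypothetical helper: linear distance-to-similarity conversion,
// clamped to the range [0, simMax].
// disMax is the distance treated as a total mismatch (assumed 250.0 here).
double similarityFromDistance(double dis,
                              double disMax = 250.0,
                              double simMax = 100.0)
{
    double sim = simMax - simMax / disMax * dis;
    // Distances beyond disMax would give negative values; treat them as 0.
    return std::max(0.0, std::min(simMax, sim));
}
```

For example, similarityFromDistance(0.0) yields 100.0, and any distance above 250 yields 0.0 instead of a negative number.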