How to convert distance into probability?

Question

Can anyone shed some light on my MATLAB program? I have data from two sensors and I'm doing kNN classification for each of them separately. In both cases the training set is a set of vectors, 42 rows in total, like this:

[44 12 53 29 35 30 49;

 54 36 58 30 38 24 37;..]

Then I get a sample, e.g. [40 30 50 25 40 25 30], and I want to assign it to its closest neighbor. As the proximity criterion I use the Euclidean metric, sqrt(sum(Y.^2)), where Y is the element-wise difference between the sample and a training vector. This gives me an array of distances between the sample and each class of the training set.

So, two questions:

Is it possible to convert the distances into a probability distribution, something like: Class 1: 60%, Class 2: 30%, Class 3: 5%, Class 5: 1%, etc.?

Added: So far I've been using the formula probability = distance / sum of distances, but I cannot plot a correct CDF or histogram. This gives me a distribution of sorts, but I see a problem: if the distance is large, for example 700, the closest class will still get the biggest probability, which would be wrong because the distance is too large for the sample to be matched with any of the classes.

If I could get two probability density functions (one per sensor), I guess I could then take some product of them. Is that possible?

Any help or remark is highly appreciated.

probability should always add up to 1 - so you should figure out that your normalization is (some number related to one state) / (sum of numbers corresponding to all states). What that means in your case is a bit hard to judge. — Floris, May 04 '14 at 19:23
Thanks for your comments, guys. I understand that the total probability must be equal to 1, and `probability = distance/sum of distances` satisfies that. — niko_dry, May 04 '14 at 19:34
But imagine this situation: the minimal distance is 50, the 2nd minimum is 100, the 3rd minimum is 500, while the sum is 30,000. What I obtain from this formula would be 0.16%, 0.3%, 1.6%... and, let's say, 33% for the farthest one. That's not a probability, more like a % of error, but how to make it more concise? — niko_dry, May 04 '14 at 19:42

score 13 · Answer 1 · answered Sep 22 '17 at 23:23

I think there are multiple ways of doing this:

1. As Adam suggested, use 1/d / sum(1/d).
2. Use the square, or an even higher power, of the inverse distance, e.g. 1/d^2 / sum(1/d^2). This makes the class probability distribution more skewed. For example, if 1/d gives a 40%/60% prediction, 1/d^2 gives about 31%/69%, and higher powers skew it further.
3. Use softmax (https://en.wikipedia.org/wiki/Softmax_function) on the negative distance, i.e. exp(-d) / sum(exp(-d)).
4. Use exp(-d^2/sigma^2) / sum[exp(-d^2/sigma^2)], which imitates Gaussian likelihoods. Sigma could be the average within-cluster distance, or simply set to 1 for all clusters.
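For illustration, the options listed above could be sketched roughly as follows. This is plain Python rather than MATLAB, and the function names, the `squared` flag, and the two example distances are made up for the sketch. The Gaussian-style variant is read as softmax(-d^2/sigma^2), which is an interpretation rather than something the answer spells out.

```python
import math

def inverse_power_probs(distances, power=1):
    # Weight each class by 1/d^power, then normalize so the weights sum to 1.
    # Assumes every distance is strictly positive.
    weights = [1.0 / d ** power for d in distances]
    total = sum(weights)
    return [w / total for w in weights]

def softmax_probs(distances, sigma=1.0, squared=False):
    # Softmax over negative distances scaled by sigma.
    # squared=True uses -(d/sigma)^2, the Gaussian-style variant.
    if squared:
        scores = [-((d / sigma) ** 2) for d in distances]
    else:
        scores = [-d / sigma for d in distances]
    # Subtracting the largest score avoids underflow when all distances are big.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

distances = [2.0, 3.0]  # hypothetical distances to two classes
print(inverse_power_probs(distances, power=1))      # about 60% / 40%
print(inverse_power_probs(distances, power=2))      # about 69% / 31%
print(softmax_probs(distances))                     # plain softmax of -d
print(softmax_probs(distances, sigma=1.0, squared=True))
```

Note that the softmax variants depend on the scale of the distances, so sigma has to be chosen to match the units of the data, whereas the inverse-power variants are unchanged if all distances are multiplied by the same constant.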

Your 4. is a generalization of your 3., i.e. your 4. is simply `softmax(-d^2/s^2)` — LucasB, Nov 21 '17 at 11:12

score 9 · Answer 2 · answered May 07 '14 at 18:00

You could try inverting your distances to get a likelihood measure, i.e. the bigger the distance x, the smaller its inverse 1/x. Then you can normalize: probability = (1/distance) / sum(1/distance).
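As a rough check, here is a small Python sketch comparing the formula from the question with the inverse-distance normalization. It uses only the three smallest distances quoted in the comments (50, 100, 500) as if they were the whole set, which is a simplification; the function names are made up.

```python
def distance_share(distances):
    # Formula from the question: d / sum(d).
    # A larger distance gets a larger share, which is the wrong way round.
    total = sum(distances)
    return [d / total for d in distances]

def inverse_distance_probs(distances):
    # (1/d) / sum(1/d). Assumes every distance is strictly positive;
    # a zero distance (exact match) would need separate handling.
    inverses = [1.0 / d for d in distances]
    total = sum(inverses)
    return [w / total for w in inverses]

distances = [50.0, 100.0, 500.0]
print(distance_share(distances))          # farthest class gets the largest value
print(inverse_distance_probs(distances))  # closest class gets the largest value
```

With these three distances the inverse-distance version gives roughly 62.5%, 31.25% and 6.25%, so the ranking now follows proximity.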

Adam Kosiorek

This method is called [inverse distance weighting](https://en.wikipedia.org/wiki/Inverse_distance_weighting). – user76284 Dec 29 '18 at 10:16

score -3 · Answer 3 · answered Mar 19 '19 at 21:55

Have you tried the formula probability = 1 - distance, assuming that you are using a standardized distance between 0 and 1?
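One possible reading of this suggestion, sketched in Python: the answer does not say how to standardize, so the min-max scaling here is an assumption, as is the final renormalization (1 - distance on its own does not sum to 1 across classes). The function name is made up.

```python
def one_minus_scaled_probs(distances):
    # Assumption: standardize by min-max scaling so distances lie in [0, 1].
    lo, hi = min(distances), max(distances)
    if hi == lo:
        # All classes equally far away: fall back to a uniform distribution.
        return [1.0 / len(distances)] * len(distances)
    similarities = [1.0 - (d - lo) / (hi - lo) for d in distances]
    # Assumption: renormalize so the values sum to 1.
    total = sum(similarities)
    return [s / total for s in similarities]

print(one_minus_scaled_probs([50.0, 100.0, 500.0]))
```

A side effect of this scaling is that the farthest class always gets exactly 0, regardless of how far away it actually is.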

GonzaloMoreno

This does not provide an answer to the question. You can [search for similar questions](//stackoverflow.com/search), or refer to the related and linked questions on the right-hand side of the page to find an answer. If you have a related but different question, [ask a new question](//stackoverflow.com/questions/ask), and include a link to this one to help provide context. See: [Ask questions, get answers, no distractions](//stackoverflow.com/tour) – geisterfurz007 Mar 19 '19 at 21:56
