
I have an optimization problem with a set of providers P selling objects Op of different types, each with a performance vector Pv = [p1, p2, p3, ..., pn], and a set of client requests R asking for objects Or, each with an expected performance vector Er = [e1, e2, ..., en].

I would like to find which providers' objects are close enough to the ones requested by clients, based on the performance vectors. I have looked at measures such as squared Euclidean distance, but I am not sure how to use it because the components of the performance vectors have different units: p1 is measured in seconds, p2 in dollars, and so on.

Could anyone shed some light and suggest a methodology?

user2567806

1 Answer


The first idea you should try is to scale each of your features independently before comparing them.

For instance, take all your p1 samples, compute their mean and standard deviation, then transform each sample s to (s - mean)/std. Do this for each of your features, except for those that are already binary (0/1).
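As a rough illustration of that scaling step, here is a minimal sketch using only the Python standard library. The function name `standardize_columns`, the `binary_columns` argument, and the sample numbers are all made up for the example; it also assumes the population standard deviation is acceptable and that a constant column should simply be left centered rather than divided by zero.

```python
from statistics import mean, pstdev

def standardize_columns(rows, binary_columns=()):
    """Return a z-scored copy of rows plus the (mean, std) used for each column.

    rows is a list of equal-length performance vectors; columns whose index
    is in binary_columns are assumed to be 0/1 flags and are left unchanged.
    """
    n_features = len(rows[0])
    params = []
    for j in range(n_features):
        if j in binary_columns:
            params.append((0.0, 1.0))  # identity transform for 0/1 features
            continue
        column = [row[j] for row in rows]
        mu = mean(column)
        sigma = pstdev(column)
        # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
        params.append((mu, sigma if sigma > 0 else 1.0))
    scaled = [
        [(row[j] - params[j][0]) / params[j][1] for j in range(n_features)]
        for row in rows
    ]
    return scaled, params

# Hypothetical data: column 0 in seconds, column 1 in dollars.
offers = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scaled_offers, scaling_params = standardize_columns(offers)
```

After scaling, both columns are expressed in units of "standard deviations from the mean", so seconds and dollars become comparable. A client request would need to be transformed with the same `scaling_params` as the provider data before any comparison.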

Then you can use Euclidean distance as a first attempt at judging whether two points are far apart.

Similarity measures are related but distinct: you can use something like e^(-distance(x, y)) to get a similarity between 0 and 1, and there are other measures you could try as well. You should use these on the scaled data, not the original.
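The distance and similarity idea could be sketched as follows, again with the standard library only (`math.dist` needs Python 3.8 or later). The names `similarity` and `rank_offers` and the vectors are hypothetical, and the inputs are assumed to have already been scaled feature by feature.

```python
import math

def similarity(x, y):
    """exp(-Euclidean distance): 1.0 for identical vectors, approaching 0 as they move apart."""
    return math.exp(-math.dist(x, y))

def rank_offers(request, offers):
    """Return (offer_index, similarity) pairs, most similar offer first."""
    scored = [(i, similarity(request, offer)) for i, offer in enumerate(offers)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical, already-scaled vectors (one client request, three provider objects).
scaled_request = [0.0, 0.0]
scaled_offers = [[3.0, 4.0], [0.0, 0.0], [0.6, 0.8]]
ranking = rank_offers(scaled_request, scaled_offers)
```

What counts as "close enough" is then a threshold on the similarity (or equivalently on the distance) that you would have to choose for your application.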

Matthieu Brucher