I have two lists, A and B. A will be at most ~1000 elements, and B will be at most ~100 elements. I want to match every element of B to an element of A, such that the sum of the absolute differences of the pairs is minimized.

That is, I want to choose |B| distinct indexes from A and assign one to each index of B, such that the following sum is minimized: sum(abs(A[index_mapping(i)] - B[i]) for i in range(|B|))

My first approach is:

  1. For each element of B, compute the |B| closest elements of A.
  2. Choose the pairs in a greedy fashion (i.e. minimum error first), skipping any index of A that is already taken.
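
The two steps above can be sketched roughly as follows. This is only an illustration of the greedy idea, not a tuned implementation; the function name `greedy_match` and the returned dict (index of B mapped to index of A) are my own choices, and it assumes len(A) >= len(B).

```python
import heapq

def greedy_match(A, B):
    """Greedy pairing: repeatedly take the smallest remaining |A[j] - B[i]|.

    Returns a dict mapping each index i of B to a distinct index j of A.
    Assumes len(A) >= len(B). Not guaranteed to be optimal.
    """
    k = len(B)
    candidates = []
    for i, b in enumerate(B):
        # Step 1: indexes of the |B| elements of A closest to B[i].
        nearest = heapq.nsmallest(k, range(len(A)), key=lambda j: abs(A[j] - b))
        candidates.extend((abs(A[j] - b), i, j) for j in nearest)

    # Step 2: minimum error first, skipping indexes that are already used.
    candidates.sort()
    mapping, used = {}, set()
    for cost, i, j in candidates:
        if i not in mapping and j not in used:
            mapping[i] = j
            used.add(j)
    return mapping
```

A tiny case where this falls short: with A = [1, 4] and B = [3, 5], the greedy pass first pairs 3 with 4 (error 1), which forces 5 onto 1 (error 4) for a total of 5, while pairing 3 with 1 and 5 with 4 gives a total of 3.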

Playing with some simple examples, it's clear that my approach is not the best. It should work fine for my purpose, but I was wondering if anyone could suggest a better approach?

Levi

2 Answers

I ended up sorting both lists, and iterating through them to match. This worked well enough for what I was doing.
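
For anyone wanting to make the sort-and-iterate idea exact, here is one possible sketch. It is not necessarily what I ran; it relies on the fact that, for absolute-difference costs, some optimal matching keeps both sorted lists in the same order, so a small dynamic program over the sorted lists is enough. The name `sorted_match_cost` is made up for this example, and it returns only the minimum total, assuming len(A) >= len(B).

```python
def sorted_match_cost(A, B):
    """Minimum sum of |a - b| over matchings of every b to a distinct a.

    Sorts both lists, then runs an O(len(A) * len(B)) dynamic program.
    Assumes len(A) >= len(B).
    """
    a, b = sorted(A), sorted(B)
    n, m = len(a), len(b)
    INF = float("inf")

    # prev[j]: best cost of matching the first i elements of b
    # using only the first j elements of a. For i = 0 the cost is 0.
    prev = [0] * (n + 1)
    for i in range(1, m + 1):
        cur = [INF] * (n + 1)
        for j in range(i, n + 1):
            skip = cur[j - 1]                                # leave a[j-1] unused
            pair = prev[j - 1] + abs(a[j - 1] - b[i - 1])    # pair a[j-1] with b[i-1]
            cur[j] = min(skip, pair)
        prev = cur
    return prev[n]
```

With lists of about 1000 and 100 elements this is on the order of 100,000 inner steps, which should be quick in practice.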

Levi

Hmm... Offhand, the first thought that comes to mind is that if you can sort A and B, then once you find the first mapping of B[i] to A[j], you can start your testing for B[i+1] at A[j] instead of A[0].

For example:

A = [ 23, 34, 38, 52, 67, 68, 77, 80, 84, 95 ]
B = [ 31, 33, 64, 65, 99 ]

You start with B[0] = 31 and step through A until you find the closest match, A[1]. Since the lists are ordered, you know that B[1] won't match anything less than A[1], so you can start comparing from there. Turns out, A[1] is still the closest match. At B[2], the closest match is A[4], so you know that B[3] won't match anything lower than A[4], and there's no need to search A[0] through A[3].
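
Here is a rough sketch of that scan, assuming both lists are already sorted ascending. The function name `nearest_indexes` is just something I picked for the example. Note that it only finds the closest element of A for each element of B, so the same index of A can show up more than once.

```python
def nearest_indexes(A, B):
    """For each B[i], return the index of the closest element of A.

    Both lists must be sorted ascending and A must be non-empty.
    The scan position j never moves backwards, so the whole pass
    takes O(len(A) + len(B)) steps. Matches are NOT forced to be unique.
    """
    result = []
    j = 0
    for b in B:
        # Advance while the next element of A is at least as close to b.
        while j + 1 < len(A) and abs(A[j + 1] - b) <= abs(A[j] - b):
            j += 1
        result.append(j)
    return result

A = [23, 34, 38, 52, 67, 68, 77, 80, 84, 95]
B = [31, 33, 64, 65, 99]
print(nearest_indexes(A, B))  # [1, 1, 4, 4, 9]
```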

King Skippus
  • Ah, I just noticed that you want every index in B to match a *unique* index in A. That's a tough one, it would probably involve some kind of traveling salesman algorithm where you iterate over the set of all matches, tallying the sum of distances for every one, and choosing the lowest one. – King Skippus Jun 02 '11 at 04:56
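
For reference, the "iterate over the set of all matches" idea from the comment can be sketched as below. This is only a brute-force illustration (the name `brute_force_match` is invented here), and it is feasible only for very small lists because the number of candidate mappings grows factorially. The unique-match requirement is usually called the assignment problem rather than traveling salesman, and polynomial-time methods such as the Hungarian algorithm exist for it, so a search like this is mainly useful for checking a faster approach on small inputs.

```python
from itertools import permutations

def brute_force_match(A, B):
    """Try every assignment of distinct indexes of A to the indexes of B.

    Returns (best_cost, best_indexes), where best_indexes[i] is the index
    of A matched to B[i]. Only practical for very small inputs.
    """
    best_cost, best = float("inf"), None
    for js in permutations(range(len(A)), len(B)):
        cost = sum(abs(A[j] - b) for j, b in zip(js, B))
        if cost < best_cost:
            best_cost, best = cost, js
    return best_cost, best
```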