7

I have a list of possibilities and a desired input:

possibles = [20, 30, 40, 50, 60, 70, 80, 100]
desired = [20, 30, 40]

I want to generate the close by lists. Example:

# Distance of 1 (i.e. 1 element changes to a close-by)
[30, 30, 40]
[20, 40, 40]
[20, 30, 30]
[20, 30, 50]

# Distance of 2:
[40, 30, 40]
[30, 30, 50]
[30, 40, 40]
...

My current version is only varying one element at a time, thus, as soon as the distance is above 1, I'm missing a lot of combination.

def generate_close_by(possibles, desired):
    for k in range(1, 4):
        for i, elt in enumerate(desired):
            id = possibles.index(elt)

            new = desired[:]
            if id < len(possibles)-k-1:
                new[i] = possibles[id+k]
                yield (new)

            if id > k:
                new[i] = possibles[id-k]
                yield (new)

# Output
[30, 30, 40]
[20, 40, 40]
[20, 30, 50]
[20, 30, 30]
[40, 30, 40]
[20, 50, 40]
[20, 30, 60]
[50, 30, 40]
[20, 60, 40]
[20, 30, 70]

I'm quite sure a module should already exist to do this kind of iteration (itertools?), could you point me to the write function?

Thanks.

EDIT:

Update on the tries...

I'm trying to generate a list the same size of desired in which each element correspond to how much I have to move the element of desired.

desired = [20, 30, 40]
# Distance of 1:
distance = [1, 0, 0]
distance = [0, 1, 0]
distance = [0, 0, 1]
distance = [-1, 0, 0]
distance = [0, -1, 0]
distance = [0, 0, -1]

And then plan was to try to create the new list, and if it can't (out of bounds), it just continues. Not working yet, but might be a good approach.

Mathieu
  • 5,410
  • 6
  • 28
  • 55

4 Answers4

1

Yes you are right, itertools is gonna be really useful here. What you want is find all subsets of desired length of the possibles list WITH duplicates, and the function that does that is itertools.product

from itertools import product

possibles = [20, 30, 40, 50, 60, 70, 80, 100]
desired = [20, 30, 40]

def fake_hamming(cur, desired, possibles):
    assert len(cur) == len(desired)

    hamm = 0
    for i in range(len(cur)):
        assert cur[i] in possibles
        assert desired[i] in possibles
        hamm += abs(possibles.index(cur[i]) - possibles.index(desired[i]))

    return hamm

def generate_close_by(desired, possibles, dist):
    all_possible_lists = product(possibles, repeat=len(desired))
    return [l for l in all_possible_lists if fake_hamming(l, desired, possibles) == dist]

print(generate_close_by(desired, possibles,1))
>>> [(20, 20, 40), (20, 30, 30), (20, 30, 50), (20, 40, 40), (30, 30, 40)]

Edit There you go, changed combinations for product (see @tobias_k comment below), and also here is the fake_hamming function xD Also it is true that it will be slow for big lists, but this is the most generic way to do it

Zuma
  • 806
  • 1
  • 7
  • 10
  • That is the part I've got, I've got some issues with the hamming distance sadly. – Mathieu Jul 02 '18 at 09:30
  • For small lists for combinations, this is probably the simplest, but for longer lists (say 100 possibilities and 5 elements in desired) this quickly becomes infeasible. – tobias_k Jul 02 '18 at 09:43
  • Oh wait I just re read your question, that's not a real hamming distance that you want – Zuma Jul 02 '18 at 09:44
  • @Zuma maybe... I'm not really sure about what it is ^^' – Mathieu Jul 02 '18 at 09:44
  • @tobias_k possibles is quite small, < 20 elts, and desired up to 6 elts. Issue is that I'm going to perform this computation of close-by on thousands of elements. – Mathieu Jul 02 '18 at 09:45
  • Hamming distance counts the number of different elements in two sequences ;) – Zuma Jul 02 '18 at 09:45
  • @Zuma So it's a bit more complex indeed. – Mathieu Jul 02 '18 at 09:46
  • And your outputs can contain several times the same element ? – Zuma Jul 02 '18 at 09:46
  • @Mathieu Even then it's quite a list to filter: 20**6 is 64,000,000. (Also, it should not be `combinations` but `product`, as elements can appear more then once, and order does matter.) – tobias_k Jul 02 '18 at 09:47
  • @tobias_k Not an issue. I need a generator so that I can break out early. True I need product. – Mathieu Jul 02 '18 at 09:51
  • @Mathieu see my edited answer, it does what you want I think – Zuma Jul 02 '18 at 09:55
  • @Mathieu So does this answer your question ? – Zuma Jul 02 '18 at 10:12
1
def distribute(number, bucket):
  if bucket == 1:
    yield [number]
    if number != 0:
      yield [-1 * number]
  elif number == 0:
    yield [0]*bucket
  else:
    for i in range(number+1):
      for j in distribute(number-i, 1):
        for k in distribute(i, bucket-1):
          yield j+k

def generate(possibles, desired, distance):
  for index_distance_tuple in distribute(distance, len(desired)):
    retval = desired[:]
    for i, index in enumerate(index_distance_tuple):
      if index + i < 0 or index + i >= len(possibles):
        break
      retval[i] = possibles[index + i]
    else:
      yield retval

For distance 1:

for i in generate(possibles, desired, 1):
  print(i)

Output :

[30, 30, 40]
[20, 40, 40]
[20, 20, 40]
[20, 30, 50]
[20, 30, 30]

For distance 2 :

for i in generate(possibles, desired, 2):
  print(i)

Output :

[40, 30, 40]
[30, 40, 40]
[30, 20, 40]
[30, 30, 50]
[30, 30, 30]
[20, 50, 40]
[20, 40, 50]
[20, 40, 30]
[20, 20, 50]
[20, 20, 30]
[20, 30, 60]
[20, 30, 20]
Seljuk Gulcan
  • 1,826
  • 13
  • 24
  • Misunderstanding somewhere. [30, 30, 40] is indeed at a distance of 1 since the value 20 is changed to 30 (1 changed to a value at a distance of 1 in possibles). But the following are at further distance. [50, 30, 40] is at d = 2, [50, 30, 40] at a distance of 3.. – Mathieu Jul 02 '18 at 09:41
  • I see, I completely misunderstood. I will edit my answer. – Seljuk Gulcan Jul 02 '18 at 09:44
  • @Mathieu, I edited the answer. Can you check if it is what you want? – Seljuk Gulcan Jul 02 '18 at 10:34
1

I guess I will show a more long winded approach that can be more easily generalizable.

I first write down the problem.

possible_pts = [20, 30, 40, 50, 60, 70, 80, 100]
starting_pt_in_idx = [0, 1, 2]
distance = 2

There are 3 axes that can "change". I first find the combinations of axis changes.

N = len(starting_pt_in_idx)
axis = list(range(N))

import itertools
axismoves = list(itertools.combinations_with_replacement(axis, distance))
print(axismoves)

Next, we bin it. For example, if I see axis-0 appearing twice, it becomes [2,0,0].

abs_movements = []
for combi in axismoves:
    move_bin = [0] * N
    for i in combi:
        move_bin[i] += 1
    abs_movements.append(move_bin)
print(abs_movements)

The above gave the absolute movements. To find the actual movement, we must take into consideration the change can be positive or negative along that axis.

import copy
actual_movements = []
for movement in abs_movements:
    actual_movements.append(movement)
    for i, move in enumerate(movement):
        if move != 0:
            _movement = copy.deepcopy(movement)
            _movement[i] = - move
            actual_movements.append(_movement)
print(actual_movements)

The final step is to translate the index into actual positions. So first we write this helper function.

def translate_idx_to_pos(idx_vect, points):
    idx_bounds = [0, len(points) - 1]
    pos_point = [0] * len(idx_vect)
    for i, idx_pos in enumerate(idx_vect):
        if idx_pos < idx_bounds[0] or idx_pos > idx_bounds[1]:
            return None
        else:
            pos_point[i] = points[idx_pos]
    return pos_point

Using the actual movements to act on the starting point index, then translating it back to the positions.

from operator import add
final_pts = []
for movement in actual_movements:
    final_pt_in_idx = list(map(add, starting_pt_in_idx, movement))
    final_point = translate_idx_to_pos(final_pt_in_idx, possible_pts)
    if final_point is not None:
        final_pts.append(final_point)

print(final_pts)

This gives

[40, 30, 40]
[30, 40, 40]
[30, 20, 40]
[30, 30, 50]
[30, 30, 30]
[20, 50, 40]
[20, 40, 50]
[20, 20, 50]
[20, 40, 30]
[20, 30, 60]
[20, 30, 20]
Spinor8
  • 1,587
  • 4
  • 21
  • 48
0

You could try a recursive approach: Keep track of the remaining distance and generate combinations of just the relevant elements.

def get_with_distance(poss, des, dist, k=0):
    if k < len(des):
        i = poss.index(des[k])
        for n in range(-dist, dist+1):
            if 0 <= i + n < len(poss):
                for comb in get_with_distance(poss, des, dist - abs(n), k+1):
                    yield [poss[i + n]] + comb
    elif dist == 0:
        yield []

This can still run into a few "dead-ends" if there is still dist remaining, but the des list is empty, but overall, this will check much fewer combinations than generating all the combinations up-front and checking their distance afterwards.

If the list of possible elements is longer, you might want to create a dict mapping elements to their index first, so you don't have to do poss.index(first) each time.

Example:

possibles = [20, 30, 40, 50, 60, 70, 80, 100]
desired = [20, 30, 40]
for x in get_with_distance(possibles, desired, 2):
    print(x)

Output:

[20, 20, 30]
[20, 20, 50]
[20, 30, 20]
[20, 30, 60]
[20, 40, 30]
[20, 40, 50]
[20, 50, 40]
[30, 20, 40]
[30, 30, 30]
[30, 30, 50]
[30, 40, 40]
[40, 30, 40]
tobias_k
  • 81,265
  • 12
  • 120
  • 179