
I'm looking for help with the following problem. I have a small program that is part of a much larger program. I need to loop through every combination of an array of numbers from 1 to 10 (maybe more or fewer), in the same way itertools works. However, because I have certain restrictions, I need to skip a large number of these combinations to save time, as this could potentially get very large.

Here is my program

combination = [-1, -1, -1, -1]
len_combination = len(combination)

# max_at_index[x] is how many times the value x may appear in combination
max_at_index = [0, 2, 2, 1, 2, 1, 2, 1, 3, 1]
len_index = len(max_at_index)

end = 0


def skip(depth):

    # Advance the value at this depth, wrapping back to 0 after len_index - 1
    combination[depth] = combination[depth] + 1
    if combination[depth] == len_index:
        combination[depth] = 0

    # Skip this branch if any value now appears more often than allowed
    for x in range(0, len_index):
        if combination[:depth + 1].count(x) > max_at_index[x]:
            return True

    return False


for i in range(0, len_index):

    if skip(0):
        continue

    for j in range(0, len_index):

        if skip(1):
            continue

        for k in range(0, len_index):

            if skip(2):
                continue

            for l in range(0, len_index):

                if skip(3):
                    continue

                print(combination)

This example has 4 items, each looping from 0 to 9 ([0, 0, 0, 0] to [9, 9, 9, 9]). However, my variable max_at_index restricts how many times each value is allowed to appear in the array. Here we are allowed zero 0s, two 1s, two 2s, one 3, etc. This is working well, and I can even expand or shrink the max_at_index array.

What I can't figure out is how to make the nested for loops recursive, so I can expand or shrink the size of combination to have more or fewer elements.

Thank you in advance.

EDIT: As requested, some explanation of my logic.

Consider the following list of costs

[
[1, 2, 3, 4, 5, 6, 0, 8, 9],
[10, 11, 12, 0, 14, 15, 16, 17, 18, 19],
[0, 21, 22, 23, 24, 25, 26, 27, 28, 29],
[30, 0, 32, 33, 34, 35, 0, 37, 38, 0]
]

I have to generate the smallest possible total when picking one number from each array where...

  • The number chosen from each array cannot be 0
  • The count of picks at any one index cannot exceed a given limit (i.e. no more than 3 from index 2)
  • The number of picks from index 0 must reach a given limit (for example, 2 must come from index 0), or the next highest count possible.

This part I have figured out too. If I loop over every single possible combination from 0,0,0,0 to 9,9,9,9, I can test whether each one meets the above. I just need to avoid looping over every combination, as most of them will be useless and the search will grow large.
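
For illustration, a combination can be tested against those rules and costed roughly like this (a minimal sketch; is_valid, total_cost and min_from_index_0 are hypothetical names, not part of my program, and the "or the next highest possible" fallback is not modelled):

def is_valid(combination, costs, max_at_index, min_from_index_0=2):
    # Rule 1: the chosen number cannot be 0
    if any(costs[row][idx] == 0 for row, idx in enumerate(combination)):
        return False
    # Rule 2: no index may be chosen more often than its limit allows
    if any(combination.count(idx) > max_at_index[idx] for idx in set(combination)):
        return False
    # Rule 3 (simplified): require a minimum number of picks from index 0
    return combination.count(0) >= min_from_index_0


def total_cost(combination, costs):
    # One pick per row of the cost table, at the chosen index
    return sum(costs[row][idx] for row, idx in enumerate(combination))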

xn1
  • Where do you use i,j,k,l in your loops ? – Born Tbe Wasted Apr 01 '19 at 09:53
  • Is order important in your case? That is, do you need to produce, for example, both `(0, 2, 1, 2)` and `(2, 2, 1, 0)` as different combinations, or there should be no produced tuples with the same elements? – jdehesa Apr 01 '19 at 09:53
  • When I first wrote this, the skip logic was part of each loop and not a function at all; the loop variables were used there and are redundant now – xn1 Apr 01 '19 at 09:56
  • Duplicates are important; each of the combinations points to an index in another array. I have an array of 4 arrays with 10 elements each. This builds every combination of values from these arrays (1 from each of the 4) – xn1 Apr 01 '19 at 09:58
  • I would suggest using itertools nevertheless (probably multiple calls to its functions) in combination with the right arguments. This way you should be able to not only set an offset but also a limit. You could also generate combinations with itertools and subsequently filter its output with your custom constraint. – fabianegli Apr 01 '19 at 09:58
  • @fabianegli My overall problem is a knapsack problem where I could be looping up to 10**10 combinations, can itertools handle this? – xn1 Apr 01 '19 at 10:00
  • If you could define a `final condition` what would it be? Since turning your loop into a recursive function, comes with a final condition. In what condition do you want your function to print your combination? – Farhood ET Apr 01 '19 at 10:00
  • @xn1 sure, itertools handles that. The question is rather whether your computer does. Just try and see :-) No need to "optimize" until you know it doesn't work for you. – fabianegli Apr 01 '19 at 10:03
  • @FarhoodET I have updated my question with my end result. – xn1 Apr 01 '19 at 10:10
  • @fabianegli It's running on a powerful server that's not being used for anything else. Power is not a problem and to some extent neither is time but if I can reduce it that would be ideal – xn1 Apr 01 '19 at 10:11
  • I believe your constraints are ambiguous, because there is no explicit rationale for which of the minima at a given index are chosen if more than the number given by the constraint are present. – fabianegli Apr 01 '19 at 10:21
  • @fabianegli Apologies. Every day we have a list of jobs that contractors can complete each for their own calculated cost, this example we have 9 contractors. Index 0 is our own employees who are available, If we don't use them we waste money, for our own staff there is a fixed cost plus a variable cost so each job will vary in cost for us too. Of the other contractors their availability changes each day, where not available the cost returns 0 and each has a maximum number of jobs they can take on each day. So I have to find the lowest combination to distribute the jobs to – xn1 Apr 01 '19 at 10:32
  • I posted a solution for a similar problem on https://stackoverflow.com/a/55444186/2352026. Instead of finding an enumeration algorithm, you can express your problem as a linear program and then ask a solver to find the optimal solution. – sdementen Apr 07 '19 at 08:53
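
For context, here is a minimal sketch of the linear-programming idea from the last comment, using PuLP (an assumption; the linked answer may use a different modelling tool). The cost table, limits and min_from_own_staff value are made up for illustration, and the "or the next highest possible" relaxation from the question is not modelled:

from pulp import LpBinary, LpMinimize, LpProblem, LpVariable, lpSum

# Hypothetical costs[job][contractor]; 0 means the contractor is unavailable.
costs = [
    [3, 12, 7, 9],
    [5, 8, 11, 6],
    [5, 0, 9, 10],
    [4, 13, 0, 8],
]
max_jobs = [2, 1, 2, 2]    # per-contractor daily limits (hypothetical)
min_from_own_staff = 2     # "at least this many jobs from index 0" (hypothetical)

jobs = range(len(costs))
contractors = range(len(costs[0]))

prob = LpProblem("job_assignment", LpMinimize)
x = LpVariable.dicts("x", [(j, c) for j in jobs for c in contractors], cat=LpBinary)

# Objective: total cost of the chosen assignments
prob += lpSum(costs[j][c] * x[j, c] for j in jobs for c in contractors)

for j in jobs:
    # Exactly one contractor per job
    prob += lpSum(x[j, c] for c in contractors) == 1
    for c in contractors:
        # Never pick an unavailable (cost 0) contractor
        if costs[j][c] == 0:
            prob += x[j, c] == 0

for c in contractors:
    # Respect each contractor's daily limit
    prob += lpSum(x[j, c] for j in jobs) <= max_jobs[c]

# Require a minimum number of jobs for our own staff (index 0)
prob += lpSum(x[j, 0] for j in jobs) >= min_from_own_staff

prob.solve()
print([(j, c) for j in jobs for c in contractors if x[j, c].value() == 1])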

4 Answers


I think this is one possible implementation:

def bounded_comb(max_at_index, n):
    yield from _bounded_comb_rec(max_at_index, n, [0] * len(max_at_index), [])

def _bounded_comb_rec(max_at_index, n, counts, current):
    # If we have enough elements finish
    if len(current) >= n:
        yield tuple(current)
    else:
        # For each index and max
        for idx, m in enumerate(max_at_index):
            # If the max has not been reached
            if m > counts[idx]:
                # Add the index
                counts[idx] += 1
                current.append(idx)
                # Produce all combinations
                yield from _bounded_comb_rec(max_at_index, n, counts, current)
                # Undo add the index
                current.pop()
                counts[idx] -= 1

# Test
max_at_index = [0, 2, 1, 3]
n = 4
print(*bounded_comb(max_at_index, n), sep='\n')

Output:

(1, 1, 2, 3)
(1, 1, 3, 2)
(1, 1, 3, 3)
(1, 2, 1, 3)
(1, 2, 3, 1)
(1, 2, 3, 3)
(1, 3, 1, 2)
(1, 3, 1, 3)
(1, 3, 2, 1)
(1, 3, 2, 3)
(1, 3, 3, 1)
(1, 3, 3, 2)
(1, 3, 3, 3)
(2, 1, 1, 3)
(2, 1, 3, 1)
(2, 1, 3, 3)
(2, 3, 1, 1)
(2, 3, 1, 3)
(2, 3, 3, 1)
(2, 3, 3, 3)
(3, 1, 1, 2)
(3, 1, 1, 3)
(3, 1, 2, 1)
(3, 1, 2, 3)
(3, 1, 3, 1)
(3, 1, 3, 2)
(3, 1, 3, 3)
(3, 2, 1, 1)
(3, 2, 1, 3)
(3, 2, 3, 1)
(3, 2, 3, 3)
(3, 3, 1, 1)
(3, 3, 1, 2)
(3, 3, 1, 3)
(3, 3, 2, 1)
(3, 3, 2, 3)
(3, 3, 3, 1)
(3, 3, 3, 2)
jdehesa
  • Thank you for your help. However, this output is different from my output. I'm looking for everything from [0,0,0,0] to [n,n,n,n] where n is the length of max_at_index (in my example [9,9,9,9]), but skipping all branches where the count of each number is restricted by max_at_index. For example, restricting 0s to 2, the first logical starting position is [0,0,1,1]. If you run my program it will show my desired output – xn1 Apr 01 '19 at 10:25
  • 1
    @xn1 If I set `max_at_index = [0, 2, 2, 1, 2, 1, 2, 1, 3, 1]` in my code I get exactly the same output as your code (I changed the input just to be able to print the output). – jdehesa Apr 01 '19 at 10:41
  • After testing, I am marking this as the answer. I tested yours against the answer from Born Tbe Wasted and found yours faster when the length of combinations exceeded 6 elements, which in my case is most likely to happen. Thank you for your assistance – xn1 Apr 01 '19 at 11:05
  • Sorry to ask so much of you @jdehesa, but since I have 32 cores to process on I was wondering about multiprocessing this portion. I am familiar with multiprocessing so that's not an issue; I was wondering if it's possible to split the loop into smaller loops, for example the first processor handles loops 0,0,0,0 to 0,9,9,9, the next does 1,0,0,0 to 1,9,9,9, all under the same conditions? The idea being that each returns its best result and the best of those results is then found. – xn1 Apr 01 '19 at 12:02
  • @xn1 If you want to compute a value for each combination and then pick the maximum one, you can use the iterator `bounded_comb` with a [process pool](https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool) (e.g. method `map`) to compute the value and take the `max`. If you want to group the combinations in chunks and produce a single value for each, you can do the same but using something like `grouper` from the [itertools recipes](https://docs.python.org/3/library/itertools.html#itertools-recipes). Or do you want to parallelize the generation of the combinations? – jdehesa Apr 01 '19 at 13:36
  • I'm here at the moment https://github.com/Xn1ch1/Costing-Calculator/blob/master/Looping%20Model I accidentally stumbled on a way to restrict the range of the loop. The only issue now is the first item in the combination is being ignored with restrictions, but this runs so fast I may be able to deal with that separately – xn1 Apr 01 '19 at 15:16
  • @xn1 I'm not sure I understand how this `bound` parameter works in your code... If you just want to set the first value of the combinations to produce, that is more or less right but then you would either have to change the `available` parameter on each process (decrease 1 for the `bound` index) or change the initial `current` (`[0] * len(available)`) to have a 1 at `bound` position. Also, you need to make sure not to generate combinations for positions where the initial availability is 0. – jdehesa Apr 01 '19 at 15:28
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/191030/discussion-between-xn1-and-jdehesa). – xn1 Apr 01 '19 at 15:32
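
A rough sketch of the process-pool approach described in the comments above (the cost matrix, limits and score function are made-up stand-ins for the real costing code; bounded_comb is the generator from this answer and is assumed to be defined in the same module):

from multiprocessing import Pool

# Hypothetical costs[job][contractor]; 0 means the contractor is unavailable.
costs = [
    [0, 12, 7, 9],
    [5, 8, 11, 6],
    [5, 0, 9, 10],
    [4, 13, 0, 8],
]
max_at_index = [2, 1, 2, 2]   # per-contractor job limits (illustrative)


def score(combination):
    # Return (total cost, combination); unavailable picks are disqualified.
    if any(costs[job][c] == 0 for job, c in enumerate(combination)):
        return (float('inf'), combination)
    return (sum(costs[job][c] for job, c in enumerate(combination)), combination)


if __name__ == '__main__':
    with Pool() as pool:
        combos = bounded_comb(max_at_index, len(costs))
        best = min(pool.imap_unordered(score, combos, chunksize=256))
    print(best)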

This is an attempt where I construct a restricted pool of values to select from (select_from) and then build the combinations:

from itertools import chain, combinations

max_at_index = [0, 2, 2, 1, 2, 1, 2, 1, 3, 1]

select_from = list(chain.from_iterable(n * [i] for i, n in enumerate(max_at_index)))
# [1, 1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 8, 8, 9]

for comb in combinations(select_from, 4):
    print(comb)

This produces the combinations sorted. If you need all the permutations as well, you need to do that afterwards (I use the `seen` sets here in order to avoid duplicates):

from itertools import chain, combinations, permutations

max_at_index = [0, 2, 2, 1, 2, 1, 2, 1, 3, 1]

seen_comb = set()

select_from = list(chain.from_iterable(n * [i] for i, n in enumerate(max_at_index)))

for comb in combinations(select_from, 4):

    sorted_comb = tuple(sorted(comb))
    if sorted_comb in seen_comb:
        continue
    seen_comb.add(sorted_comb)

    seen_perm = set()

    for perm in permutations(comb):
        if perm in seen_perm:
            continue
        seen_perm.add(perm)

        print(perm)
hiro protagonist
  • This is almost there, it seems to be missing any combination starting with a 9 – xn1 Apr 01 '19 at 10:34
  • This is looking promising indeed thank you. I'm just going to test a few times and report back – xn1 Apr 01 '19 at 10:39
  • Using the second one, I'm getting a lot of duplicate results; for example with combinations(select_from, 2) the first 5 outputs contain 2 of the same. Is that something I've done wrong? – xn1 Apr 01 '19 at 10:43
  • 1
    you are right... the `seen` needs to be 'global'... fixed, – hiro protagonist Apr 01 '19 at 11:12
  • Yes, that works thank you. I will keep with the marked answer though, itertools seems to be a slow option here, roughly 8 times slower than marked answer under the same conditions. – xn1 Apr 01 '19 at 11:18
  • 1
    updated a bit. this will iterate over fewer permutations now.but the accepted answer looks better indeed... – hiro protagonist Apr 01 '19 at 11:20

I didn't want to show anything fancy, just to give you the simplest answer for a recursive loop (as that was your question):

combination = [-1, -1, -1, -1]
len_combination = len(combination)
max_depth = 3
max_at_index = [0, 2, 2, 1, 2, 1, 2, 1, 3, 1]
len_index = len(max_at_index)

end = 0

def skip(depth):

    combination[depth] = combination[depth] + 1
    if combination[depth] == len_index:
        combination[depth] = 0

    for x in range(0, len_index):
        if combination[:depth + 1].count(x) > max_at_index[x]:
            return True, combination  # Needs to return the state of combination

    return False, combination  # Needs to return the state of combination

def loop(depth, combination):
    # Every depth iterates len_index times, exactly like each of the nested loops
    for i in range(0, len_index):
        skipped, combination = skip(depth)
        if not skipped:
            if depth == max_depth:
                print(combination)
            else:
                loop(depth + 1, combination)

loop(0, combination)
Born Tbe Wasted
  • This seems to be working very very fast indeed, I shall test a bit and get back to you thank you. – xn1 Apr 01 '19 at 10:46
  • Thank you for your assistance. Whilst your answer worked perfectly and in some cases faster than the marked answer, it was slower where max_depth was 7 or more, which in my use case is most likely to happen. – xn1 Apr 01 '19 at 11:06

SymPy also provides everything you need:

from sympy.utilities.iterables import multiset_permutations


max_at_index = [0, 2, 2, 1, 2, 1, 2, 1, 3, 1]
m_set = {i: n for  i, n in enumerate(max_at_index) if n != 0}

for perm in multiset_permutations(m_set, 4):
    print(perm)

Explanation:

The data type this is based on is a multiset (i.e. a set where elements may appear more than once, but the order does not matter). There is a function for such a data structure in sympy: sympy.utilities.iterables.multiset

from itertools import chain
from sympy.utilities.iterables import multiset

max_at_index = [0, 2, 2, 1, 2, 1, 2, 1, 3, 1]
m_set = multiset(chain.from_iterable(n * [i] for i, n in enumerate(max_at_index)))
# {1: 2, 2: 2, 3: 1, 4: 2, 5: 1, 6: 2, 7: 1, 8: 3, 9: 1}

Actually, multiset just returns a dict, so this is simpler:

m_set = {i: n for  i, n in enumerate(max_at_index) if n != 0}
# {1: 2, 2: 2, 3: 1, 4: 2, 5: 1, 6: 2, 7: 1, 8: 3, 9: 1}

Fortunately, sympy also has methods to permute and combine those multisets without generating any repetition:

from sympy.utilities.iterables import multiset_permutations

for perm in multiset_permutations(m_set, 4):
    print(perm)

To make parallelizing easier, it may help to calculate the combinations first:

from sympy.utilities.iterables import multiset_combinations, multiset_permutations

for comb in multiset_combinations(m_set, 4):
    print()
    for perm in multiset_permutations(comb):
        print(perm)

which produces the following (with a blank line added after every new combination):

[1, 1, 2, 2]
[1, 2, 1, 2]
[1, 2, 2, 1]
[2, 1, 1, 2]
[2, 1, 2, 1]
[2, 2, 1, 1]

[1, 1, 2, 3]
[1, 1, 3, 2]
[1, 2, 1, 3]
[1, 2, 3, 1]
[1, 3, 1, 2]
[1, 3, 2, 1]
[2, 1, 1, 3]
[2, 1, 3, 1]
[2, 3, 1, 1]
[3, 1, 1, 2]
[3, 1, 2, 1]
[3, 2, 1, 1]

...

[8, 8, 8, 9]
[8, 8, 9, 8]
[8, 9, 8, 8]
[9, 8, 8, 8]
hiro protagonist
  • This is promising, do you think it's possible to control the start and end point to the loops. Where I am currently is (using the accepted answer) multi processing the loops (server has 32 cores). My test conditions are max_at_index = [2, 2, 2, 2, 2, 2, 2], with a combination length of 8, on each loop increase a counter by 1. This version takes 6.42 seconds. Using multi processing on the other answer (each process created loops [n, 0, 0, 0, 0, 0, 0, 0] through [n, 9, 9, 9, 9, 9, 9, 9]) takes 1.09 seconds. – xn1 Apr 01 '19 at 15:03
  • you could try to process the permutations in parallel. the number of permutations range from 2520 to 20160... you could also group some of the combinations together before starting a thread to get the permutations. i would not know how to tweak the start and end points.... – hiro protagonist Apr 01 '19 at 15:13
  • some stats (may help parallelizing this): there are 357 combinations (each with 2520 to 20160 permutations) which is a total of 2346120. – hiro protagonist Apr 01 '19 at 15:17
  • My ultimate benchmark is a combination length of 10 and an max_index length of 7. If each max_index was set to 7 that would be 282,475,249 combinations, which thankfully will never be a situation, each being 2 is a more realistic case. To which I've no idea how many combinations there are. Currently with this file https://github.com/Xn1ch1/Costing-Calculator/blob/master/Looping%20Model I'm utilizing at most 17% of CPU. So I'm eager to split the loops onto more processors – xn1 Apr 01 '19 at 15:27