I need to generate all possible combinations from n input lists and run a calculation on each combination.

Current code example:

import itertools
import operator  # needed for operator.xor below

# example inputs
list_small = [1, 2, 3]
list_medium = [444, 666, 242]
list_huge = [1680, 7559, 5573, 43658, 530, 11772, 284, 50078, 783, 37809, 6740, 37765, 74492, 50078, 783, 37809, 6740, 37765, 74492]

# out of the input list, I need to generate all numbers from 0 to the current list element
# e.g. if I have 6, I need to get [0, 1, 2, 3, 4, 5, 6]
# if I get a list [1, 2, 3], the output will be [[0, 1], [0, 1, 2], [0, 1, 2, 3]]
# I achieved this by doing it with xrange: [x for x in xrange(0, current_list_element + 1)]
# after that, I need to generate all possible combinations using the generated lists
# I managed to do this by using itertools.product()

# print this to get all possible combinations
# print list(itertools.product(*[[x for x in xrange(0, current_list_element + 1)] for current_list_element in list_medium]))

cumulative_sum = 0
for current_combination in itertools.product(*[[x for x in xrange(0, current_list_element + 1)] for current_list_element in list_medium]):
    # now I need to do some calculations to the current combination
    # e.g. get sum of all combinations, this is just an example
    cumulative_sum += sum(current_combination)

    # another example
    # get XOR sum of current combination, more at https://en.wikipedia.org/wiki/Exclusive_or
    print reduce(operator.xor, current_combination, 0)

# runs fast for list_small, then takes some time for list_medium and then takes ages for list_huge
print cumulative_sum

This works fine for smaller lists, but for larger lists it either runs seemingly forever or throws a RuntimeError. Is there a better way to generate all the combinations? Or am I using xrange incorrectly?

I tried this with Python 2.7 and PyPy 2.

EDIT: Thanks to @famagusta, I got rid of xrange, but the problem remains:

import itertools
import operator  # needed for operator.xor below

# example inputs
list_small = [1, 2, 3]
list_medium = [444, 666, 242]
list_huge = [1680, 7559, 5573, 43658, 530, 11772, 284, 50078, 783, 37809, 6740, 37765, 74492, 50078, 783, 37809, 6740, 37765, 74492]

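# build one shared list from the largest element, then slice it per input element
max_element = max(list_medium)
combo_list = range(0, max_element + 1)

cumulative_sum = 0
# combo_list[:element + 1] is [0, 1, ..., element]; product() needs one iterable per input element
for current_combination in itertools.product(*[combo_list[:element + 1] for element in list_medium]):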
    # now I need to do some calculations to the current combination
    # e.g. get sum of all combinations, this is just an example
    cumulative_sum += sum(current_combination)

    # another example
    # get XOR sum of current combination, more at https://en.wikipedia.org/wiki/Exclusive_or
    print reduce(operator.xor, current_combination, 0)

# runs fast for list_small, then takes some time for list_medium and then takes ages for list_huge
print cumulative_sum
Ivan Bilan
  • it seems it fits better in code review than in SO if your solution is already working. just curious why you need the `0` as product of it will always be 0 and won't have any input to your cumulative_sum? – Anzel Aug 25 '16 at 09:41
  • If I'm understanding you correctly, with ``list_huge`` you are looking at 4326103124078513425142526919770571037551206383570394447976698504265728000000 combinations. Any algorithm that needs that many steps is entirely doomed... – Armin Rigo Aug 25 '16 at 09:48
  • @Anzel that was just an example, I have added a new one with an XOR sum in which I need all combinations – Ivan Bilan Aug 25 '16 at 10:27
  • @Armin Rigo, yea, there must be some workaround – Ivan Bilan Aug 25 '16 at 10:34
  • @ivan_bilan: we can't start thinking about a workaround if we don't know what you are computing at every iteration. If you want a solution that works whatever you are computing, then I can only answer "it will never work, there are far too many iterations". – Armin Rigo Aug 26 '16 at 07:22
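
As a rough sanity check on the size of the problem discussed in the comments above: itertools.product yields one tuple per element of the Cartesian product, so the number of iterations is the product of the individual range lengths. The sketch below counts them without generating anything. The helper name count_combinations is made up for this illustration and is not part of the original code.

```python
from functools import reduce  # built in on Python 2, must be imported on Python 3
import operator


def count_combinations(limits):
    """Number of tuples product() would yield for the ranges 0..limit (inclusive)."""
    # each input element `limit` contributes a range with limit + 1 values
    return reduce(operator.mul, (limit + 1 for limit in limits), 1)


list_small = [1, 2, 3]
list_medium = [444, 666, 242]
list_huge = [1680, 7559, 5573, 43658, 530, 11772, 284, 50078, 783, 37809,
             6740, 37765, 74492, 50078, 783, 37809, 6740, 37765, 74492]

print(count_combinations(list_small))   # 2 * 3 * 4 = 24
print(count_combinations(list_medium))  # 445 * 667 * 243 = 72126045
print(count_combinations(list_huge))    # far too many tuples to ever loop over
```

The count grows multiplicatively with every additional input element, which is why list_medium is merely slow while list_huge never finishes.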

1 Answer

Generating such nested lists could get you into trouble with memory limitations. Instead of repeatedly generating sublists, you can use just one super list generated from the largest number in the list. Just store the indices where smaller elements would have stopped.

For example, for the input [1, 6, 10] you would keep the single list [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] together with the stop values [1, 6, 10].

The second list tells you where to stop in the first list to extract the sublists of interest for the computation.

This should save you some space.

list_small = [1, 2, 3]
list_medium = [444, 666, 242]
list_huge = [1680, 7559, 5573, 43658, 530, 11772, 284, 50078, 783, 37809, 6740, 37765, 74492, 50078, 783, 37809, 6740, 37765, 74492]

max_element = max(list_huge)   # being lazy here - write a max function
combo_list = range(0, max_element + 1)  # xrange does not support slicing

cumulative_sum = 0
for element in list_huge:
    # slice up to element + 1 so that the element itself is included (0..element)
    cumulative_sum += sum(combo_list[:element + 1])

print(cumulative_sum)
Jordan Jambazov
famagusta
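
For the plain-sum example from the question specifically, the loop over all combinations can be avoided: each value in position i appears once for every combination of the other positions, so the total can be computed per position. The following is a minimal sketch under that assumption only; it does not carry over to the XOR example, and the function names are made up for this illustration.

```python
import itertools


def total_sum_closed_form(limits):
    """Sum of sum(combo) over all combos of the ranges 0..limit, without enumerating them."""
    sizes = [limit + 1 for limit in limits]
    total_count = 1
    for size in sizes:
        total_count *= size

    total = 0
    for limit, size in zip(limits, sizes):
        range_sum = limit * (limit + 1) // 2  # 0 + 1 + ... + limit
        # every value at this position is repeated once per combination of the others
        total += range_sum * (total_count // size)
    return total


def total_sum_brute_force(limits):
    """Same quantity computed the slow way, only usable for small inputs."""
    ranges = [range(limit + 1) for limit in limits]
    return sum(sum(combo) for combo in itertools.product(*ranges))


print(total_sum_closed_form([1, 2, 3]))  # 72
print(total_sum_brute_force([1, 2, 3]))  # 72
```

The closed form does a constant amount of work per input element, so it stays fast even for list_huge, whereas the brute-force version is only there to cross-check small cases.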
  • "# being lazy here - write a max function" How about `max`? :-) – Lucas Moeskops Aug 25 '16 at 10:04
  • lol!! yes!! weird, my editor didn't show that. must investigate :P – famagusta Aug 25 '16 at 10:18
  • Thanks, using sum was just an example, I have added another example where your approach will not work, since you do not generate all combinations – Ivan Bilan Aug 25 '16 at 10:22
  • ok, I have added your solution to generating the lists instead of using xrange. The algorithm is still slow, I think itertools.product is the problem here – Ivan Bilan Aug 25 '16 at 10:49
  • The way itertools.product(A, B) works is that it generates nested for loops: `(x, y) for x in A for y in B`. Doing this for a list of lists would result in very deep nesting. A general approach cannot be foreseen; could you tell us exactly the computation you need, for further optimization? – famagusta Aug 25 '16 at 11:06
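
The nested-loop description of itertools.product in the last comment can be checked on a tiny input. This is only an illustrative sketch with made-up lists a and b: it shows that the tuples and their order match a pair of nested loops, which also means the running time is driven by how many tuples are yielded.

```python
import itertools

a = [0, 1]
b = [0, 1, 2]

# explicit nested loops: the rightmost iterable varies fastest
nested = [(x, y) for x in a for y in b]

# the same tuples, in the same order, from itertools.product
via_product = list(itertools.product(a, b))

print(nested == via_product)  # True
print(len(via_product))       # len(a) * len(b) = 6
```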