
I am using the following function to find the subsets of a list L. However, converting the output of the function powerset into a list takes far too long. Any suggestions?

For clarification, this powerset function outputs neither the empty subset nor the subset L itself (this is intentional).

My list L:

L = [0, 3, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107]

The code:

def powerset(s):
    x = len(s)
    masks = [1 << i for i in range(x)]
    for i in range(1, (1 << x)-1):
        yield [ss for mask, ss in zip(masks, s) if i & mask]

my_Subsets = list(powerset(L)) # <--- THIS TAKES WAY TOO LONG
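
To see why the conversion stalls, it can help to run the same generator on small inputs and count what comes out. This is a minimal sketch: `count_subsets` is a hypothetical helper name, not part of the original code, and the sizes are chosen only so the loop finishes quickly.

```python
def powerset(s):
    """Yield every subset of s except the empty set and s itself."""
    x = len(s)
    masks = [1 << i for i in range(x)]
    for i in range(1, (1 << x) - 1):
        yield [ss for mask, ss in zip(masks, s) if i & mask]


def count_subsets(n):
    """Count the subsets generated for a list of n elements (hypothetical helper)."""
    return sum(1 for _ in powerset(list(range(n))))


for n in (2, 4, 8, 16):
    # The count equals 2**n - 2, so it roughly doubles with every added element.
    print(n, count_subsets(n), 2**n - 2)
```

With 55 elements the same formula gives 2**55 - 2 items, which is what `list(...)` is being asked to build.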
gapansi
  • Yeah, I am aware. I was wondering if there are other options to make this process faster. – gapansi Dec 18 '20 at 13:37
  • Because this is a classic case of combinatorial explosion. Your set has cardinality 55, so there are 2**55 == 36028797018963968 items in the powerset – juanpa.arrivillaga Dec 18 '20 at 13:38
  • There is no "magic" you can do here. Even if generating each item took 1 *nanosecond* (which is definitely much faster than is actually the case), it would require 3.603×10^7 seconds, which is around 417 days. Again, that's assuming a speed that is totally unrealistic to begin with – juanpa.arrivillaga Dec 18 '20 at 13:41
  • Why do you need to convert it into a list? Just let it be a generator, and everything is fine :) – Jussi Nurminen Dec 18 '20 at 13:41
  • @JussiNurminen no, **everything is not fine**. – juanpa.arrivillaga Dec 18 '20 at 13:42
  • What do you want to do with the result? If you want to e.g. print some of the subsets, that can be achieved with the generator. But nothing can be achieved by converting it to a list. – Jussi Nurminen Dec 18 '20 at 13:42
  • @gapansi What problem are you actually trying to solve by generating such a large list? Why do you think you need to keep the whole list in memory? – ekhumoro Dec 18 '20 at 13:43
  • I would need to iterate over that list. I am trying to solve a VRP problem and I was basically using the book formulation for solving such problems. However, I guess I would need to find a different way. – gapansi Dec 18 '20 at 13:55
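
The generator suggestion from the comments can be sketched as follows: the generator computes only the items that are actually consumed, so taking the first few subsets is instant. This is an illustration only. The 55-element list is a stand-in for the one in the question, and a full traversal remains infeasible for the reasons given in the comments.

```python
from itertools import islice


def powerset(s):
    """Yield every subset of s except the empty set and s itself."""
    x = len(s)
    masks = [1 << i for i in range(x)]
    for i in range(1, (1 << x) - 1):
        yield [ss for mask, ss in zip(masks, s) if i & mask]


L = list(range(55))  # stand-in for the 55-element list in the question

# islice pulls just five items from the generator; nothing else is computed.
first_five = list(islice(powerset(L), 5))
print(first_five)  # [[0], [1], [0, 1], [2], [0, 2]]
```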

2 Answers


Your set has 55 elements, meaning 2^55 = 36028797018963968 subsets.

There's no way, in any language or with any algorithm, to make that fast. For each subset you need at least one allocation, and that single operation repeated 2^55 times will run practically forever. For example, at one allocation per nanosecond (in reality this is orders of magnitude slower) we are looking at something over a year (roughly 417 days). In Python, probably 100 years. :P
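
The arithmetic behind that estimate can be checked with a short sketch. The per-subset costs below are assumptions picked for illustration, not measurements, and `estimate_days` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 86_400


def estimate_days(n_elements, ns_per_subset):
    """Days needed to produce all 2**n subsets at an assumed cost per subset."""
    subsets = 2 ** n_elements
    seconds = subsets * ns_per_subset / 1e9
    return seconds / SECONDS_PER_DAY


# 1 ns is wildly optimistic; 100 ns and 1000 ns are assumed, more Python-like costs.
for ns in (1, 100, 1000):
    days = estimate_days(55, ns)
    print(f"{ns:>5} ns per subset: {days:,.0f} days (~{days / 365.25:,.1f} years)")
```

At 1 ns per subset this comes to about 417 days, and every extra factor of 100 in per-item cost multiplies the total by 100.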

Not to mention that the final result would need exabytes of memory, far more than any single machine's storage (RAM + hard drives) can hold. So the final list(...) conversion will fail with 100% probability, even if you wait those years.
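
A rough lower bound on the memory needed can be sketched like this. It assumes 64-bit CPython (8-byte pointers) and ignores list over-allocation, so the real figure would be higher. The variable names are illustrative only.

```python
import sys

n = 55
num_subsets = 2**n - 2                 # empty and full sets excluded
# Each element appears in half of all 2**n subsets; dropping the full set removes n slots.
total_slots = n * 2**(n - 1) - n

pointer_bytes = 8                      # assumption: 64-bit CPython
empty_list_bytes = sys.getsizeof([])   # per-list header; platform dependent

lower_bound = (
    num_subsets * pointer_bytes        # slots of the outer list
    + num_subsets * empty_list_bytes   # one list object per subset
    + total_slots * pointer_bytes      # element slots inside the subsets
)
print(f"at least {lower_bound / 1e18:.1f} exabytes")
```

Under these assumptions the total is on the order of 10 exabytes, far beyond any single machine.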

Whatever you are trying to achieve (this is likely an XY problem) you are doing it the wrong way.

freakish
  • I think it would help to add a concrete example of how long this is. 2^55 operations, going at 1 operation per nanosecond, would still require 417 days to complete. And the operations here would require orders of magnitude more time than a nanosecond – juanpa.arrivillaga Dec 18 '20 at 13:47
  • @juanpa.arrivillaga I've updated the answer. – freakish Dec 18 '20 at 13:53

What you could do is create a class that will behave like a list but would only compute the items as needed and not actually store them:

class Powerset:

    def __init__(self,base):
        self.base = base

    def __len__(self):
        return 2**len(self.base)-2 # - 2 you're excluding empty and full sets

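    def __getitem__(self,index):
        if isinstance(index,slice):
            return [ self.__getitem__(i) for i in range(len(self))[index] ]
        if index < 0:
            index += len(self) # support negative indices such as P[-1]
        if not 0 <= index < len(self):
            raise IndexError("Powerset index out of range")
        # index+1 skips the empty set; bit i of that number selects base[i]
        return [ss for bit,ss in enumerate(self.base) if (1<<bit) & (index+1)]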


L = [0, 3, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107]


P = Powerset(L)

print(len(P)) # 36028797018963966
print(P[:10]) # [[0], [3], [0, 3], [5], [0, 5], [3, 5], [0, 3, 5], [6], [0, 6], [3, 6]]
print(P[3:6]) # [[5], [0, 5], [3, 5]]
print(P[-3:]) # [[5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107], [0, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107], [3, 5, 6, 8, 9, 11, 13, 16, 18, 19, 20, 23, 25, 28, 29, 30, 32, 33, 35, 36, 38, 42, 43, 44, 45, 49, 50, 51, 53, 54, 56, 57, 62, 63, 64, 65, 66, 67, 71, 76, 78, 79, 81, 82, 84, 86, 87, 90, 92, 96, 97, 98, 100, 107]]

Obviously, if the next thing you do is a sequential search or traversal of the powerset, it will still take forever.
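
Where an index-based view like this may still help is random access, for example drawing a few subsets at random without enumerating anything. The sketch below restates the class compactly, with bounds checking added, so that it runs on its own. The base list, the seed and the sample size are arbitrary stand-ins.

```python
import random


class Powerset:
    """List-like view of all subsets of base except the empty and full sets."""

    def __init__(self, base):
        self.base = base

    def __len__(self):
        return 2 ** len(self.base) - 2

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("Powerset index out of range")
        # index + 1 skips the empty set; bit i of that number selects base[i].
        return [ss for bit, ss in enumerate(self.base) if (1 << bit) & (index + 1)]


P = Powerset(list(range(55)))  # stand-in for the 55-element list
rng = random.Random(0)         # fixed seed only to make the run repeatable

# Each lookup costs O(len(base)); no other subset is ever materialised.
sample = [P[rng.randrange(len(P))] for _ in range(3)]
for subset in sample:
    print(len(subset), subset[:5])
```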

Alain T.