
I have a problem involving permutations of lists/strings that I am having a hard time solving.

Say I have several lists:

list1 = ["a"]
list2 = ["a","b","c","d"]
list3 = ["b","e"]
list4 = ["f","g","a"]

I need to count all the ways of choosing one character from each list, in order, without reusing a character. From the first list I choose a character: "a" in this case, since it is the only item. Next I select an item from the second list, but it can't be "a", as that was already chosen, so it could be "b", "c", or "d". Next I choose an item from the third list; if I chose "a" from the first and "b" from the second, I could only choose "e", as "b" was already used. The same goes for the fourth list.

So I need all of the possible sequences of unique characters drawn from my lists. I don't even need to generate the sequences themselves; I just need to calculate how many there are in total. Whichever approach is less memory intensive is better, as there may be a large number of lists in the actual problem.

To be more precise: if I had two lists, list1 = ["a"] and list2 = ["b"],

there would be only one combination, because each position in the result is tied to its list. List one does not contain "b", so the only combination is ("a","b"), not ("b","a"). To further constrain the question: I don't want to retrieve the permutations themselves, only the total number of them. Returning the results takes up too much memory, as I will be working with roughly fifteen lists of 1 to 15 characters each.

Michael Scott
  • 1
    To settle the skirmish in the comments: say we had only `list1 = ['a']` and `list2 = ['b']`. Do you want the total count to be 1, as there's only `('a','b')` as a valid option, or 2, because you start from `('a','b')` and then permute it, getting `('a','b'), ('b', 'a')`? – DSM Jan 11 '15 at 00:33

3 Answers


Use itertools.product to generate every selection of one item from each list. Then, using itertools.ifilter (Python 2; the built-in filter does the same job in Python 3), filter out all selections that contain a repeated character. One simple way to do this is to check whether the length of the selection stays the same when you remove all duplicates (i.e. when you create a set from it).

import itertools

list1 = ["a"]
list2 = ["a","b","c","d"]
list3 = ["b","e"]
list4 = ["f","g","a"]

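# A combination is valid if removing duplicates does not shrink it.
f = lambda x: len(x) == len(set(x))
# Python 2 only: on Python 3, use the built-in filter() instead of itertools.ifilter.
it = itertools.ifilter(f, itertools.product(list1, list2, list3, list4))

# print all combinations (Python 2 print statement; use print(combination) on Python 3)
for combination in it:
    print combination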
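
Since the question only asks for the total, here is a minimal Python 3 sketch of the same filter that counts instead of printing. The function name count_distinct_picks is made up for illustration. It keeps memory use constant, but it still walks the full product, so the running time grows with the product of the list lengths.

```python
from itertools import product

def count_distinct_picks(lists):
    """Count picks of one item per list in which no item repeats."""
    # The generator yields one combination at a time, so nothing is stored.
    return sum(1 for combo in product(*lists) if len(set(combo)) == len(combo))

lists = [["a"], ["a", "b", "c", "d"], ["b", "e"], ["f", "g", "a"]]
print(count_distinct_picks(lists))  # 10
```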
Carsten
  • `AttributeError: 'module' object has no attribute 'ifilter'` – GLHF Jan 11 '15 at 00:18
  • last line should be `print combination`. – Mark Tolonen Jan 11 '15 at 00:18
  • @howaboutNO, He's using Python 2. – Mark Tolonen Jan 11 '15 at 00:18
  • 1
    @howaboutNO For use with Python 3, you should be fine if you replace `itertools.ifilter` with `filter` and put parentheses around the `print` line. – Carsten Jan 11 '15 at 00:24
  • 2
    @Carsten where is the permutation of `a,b,c,d` ? Your answer is not correct. OP wants ALL permutations. – GLHF Jan 11 '15 at 00:26
  • 2
    @howaboutNO: `a,b,c,d` wouldn't be valid because the third letter has to come from `list3` (b or e, although it can't be b here because that would already have been used). Carsten is correct. As the OP says: "Next I choose an item from the third list". – DSM Jan 11 '15 at 00:27
  • 1
    @howaboutNO, the OP only wants perms where you take one item from every list – Padraic Cunningham Jan 11 '15 at 00:27

Use itertools.product. It iterates over every way of choosing one item from each list. Then use a list comprehension to drop the selections that don't meet your requirements.

>>> a='a'
>>> b='abcd'
>>> c='be'
>>> d='fga'
>>> import itertools
>>> [a+b+c+d for a,b,c,d in itertools.product(a,b,c,d) if b != a and c not in [a,b] and d not in [a,b,c]]
['abef', 'abeg', 'acbf', 'acbg', 'acef', 'aceg', 'adbf', 'adbg', 'adef', 'adeg']
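
The same check can be written for any number of lists, abandoning a partial selection as soon as it repeats a character instead of building the full product first. This is only a sketch in Python 3, and count_with_pruning is a hypothetical name, not something from itertools.

```python
def count_with_pruning(lists, i=0, used=frozenset()):
    """Count selections of one item per list with no repeated item."""
    if i == len(lists):
        return 1  # every list contributed a distinct item
    total = 0
    for item in lists[i]:
        if item not in used:  # skip items already chosen from an earlier list
            total += count_with_pruning(lists, i + 1, used | {item})
    return total

print(count_with_pruning(["a", "abcd", "be", "fga"]))  # 10
```

Only the current partial selection is held in memory, so this suits the "count only" case, although the worst-case running time is still exponential in the number of lists.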
Mark Tolonen

You can cache counts of the form "starting from the i'th list, excluding elements in S". By being careful to limit S to only characters that may be excluded (that is, only elements that appear in a later list), you can reduce the amount of repeated computation.

Here's an example program:

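def count_uniq_combs(sets, i, excluding, cache):
    # sets[i] is [symbols of list i, union of symbols of all later lists].
    if i == len(sets): return 1
    key = (i, excluding)
    if key in cache:
        return cache[key]
    count = 0
    for c in sets[i][0]:
        if c in excluding: continue
        # Only remember exclusions that can still matter for a later list.
        newx = (excluding | set([c])) & sets[i][1]
        count += count_uniq_combs(sets, i + 1, newx, cache)
    cache[key] = count
    print key, count  # Python 2 print; shows each count actually computed
    return count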

def count(xs):
    sets = [[set(x)] for x in xs]
    # Pre-compute the union of all subsequent sets.
    union = set()
    for s in reversed(sets):
        s.append(union)
        union = union | s[0]
    return count_uniq_combs(sets, 0, frozenset(), dict())

print count(['a', 'abcd', 'be', 'fga'])

It prints out the values it's actually calculating (rather than recalling from the cache), which looks like this:

(3, frozenset(['a'])) 2
(2, frozenset(['a'])) 4
(2, frozenset(['a', 'b'])) 2
(1, frozenset(['a'])) 10
(0, frozenset([])) 10

For example, when looking at list 2 (counting from zero: "b", "e") there are only two counts computed: one where "a" and "b" are both excluded, and one where only "a" is excluded. Compare this to the naive implementation, where you'd also be counting many other combinations (for example: "a" and "c").

If this still isn't fast enough, you can try heuristics for ordering the lists: you want lists which contain relatively few symbols of other lists to come later.
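
For Python 3, the same caching idea can be sketched with functools.lru_cache in place of the explicit dict. The names count_unique_picks, later and go are illustrative only, and this version does not print the intermediate counts.

```python
from functools import lru_cache

def count_unique_picks(lists):
    sets = [frozenset(x) for x in lists]
    # later[i] holds every symbol that appears in lists i+1, i+2, ...
    later = [frozenset()] * len(sets)
    union = frozenset()
    for i in range(len(sets) - 1, -1, -1):
        later[i] = union
        union = union | sets[i]

    @lru_cache(maxsize=None)
    def go(i, excluding):
        if i == len(sets):
            return 1
        total = 0
        for c in sets[i]:
            if c in excluding:
                continue
            # Keep only exclusions that a later list could still collide with.
            total += go(i + 1, (excluding | {c}) & later[i])
        return total

    return go(0, frozenset())

print(count_unique_picks(["a", "abcd", "be", "fga"]))  # 10
```

The total does not depend on the order of the lists, so reordering them as suggested above changes only how many cache entries get computed, not the answer.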

Paul Hankin