I found this example for finding all strings of an alphabet of a given length.

for i in range(length):
    result = [str(x)+str(y) for x in alphabet for y in result or ['']]

I'm trying to understand how this works and if this was implemented with for loops, how it would look - all my attempts at simplifying it become very messy and crash with infinite loops... while this one works every time.


Example:

def allstrings(alphabet, length):
    """Find the list of all strings of 'alphabet' of length 'length'"""

    alphabet = list(alphabet)

    result = []

    for i in range(length):
        result = [str(x)+str(y) for x in alphabet for y in result or ['']]

    return result

# will print ['aa', 'ab', 'ba', 'bb'] (order may vary, since a set is unordered)
print(allstrings({'a', 'b'}, 2))

# will print ['000', '001', '010', '011', '100', '101', '110', '111'] (order may vary)
print(allstrings({'0', '1'}, 3))

Code is modified from: http://code.activestate.com/recipes/425303-generating-all-strings-of-some-length-of-a-given-a/

South Paw

4 Answers

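>>> import itertools
>>> alphabet = "abcd"
>>> list(itertools.permutations(alphabet, 3))

should take care of finding all the permutations of an alphabet (of word length 3). Note that permutations never reuses a letter, so strings such as 'aaa' are left out; itertools.product(alphabet, repeat=3) yields every string of length 3.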
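
A minimal sketch comparing `itertools.permutations` with `itertools.product` on a two-letter alphabet. The variable names `perms` and `strings` are chosen here for illustration only.

```python
import itertools

alphabet = "ab"

# permutations never reuses a letter, so 'aa' and 'bb' are missing
perms = ["".join(p) for p in itertools.permutations(alphabet, 2)]

# product with repeat=n yields every string of length n over the alphabet
strings = ["".join(p) for p in itertools.product(alphabet, repeat=2)]

print(perms)    # ['ab', 'ba']
print(strings)  # ['aa', 'ab', 'ba', 'bb']
```

Both calls return tuples of characters, which is why each one is passed through `"".join`.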

Joran Beasley

Basically, you are using what is known as a list comprehension, which is essentially a for loop written inside-out that builds and returns a list. Here it iterates over the given letters once per pass, for `length` passes, joining each letter onto every string built so far.

Malik Brahimi

In short, this is the equivalent of your code for a length of 2, since I haven't seen others provide it. I do recommend using itertools as Joran Beasley wrote, because it is faster and makes for a clearer, simpler statement.

def permute(alphabet):
    result = []
    for x in alphabet:
        for y in alphabet:
            result.append(str(x)+str(y))
    return result

With output in IDLE as:

>>> alphabet = ["a", "b", "c"]
>>> r = permute(alphabet)
>>> r
['aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']

However this approach makes it harder to define the desired length. To achieve that effect you would have to do something like:

def permute(original, permutated):
    result = []
    # iterate over the parameter, not the global name 'alphabet'
    for x in original:
        for y in permutated or [""]:
            result.append(str(x)+str(y))
    return result

def helper(alphabet, length):
    res = []
    for i in range(length):
        res = permute(alphabet, res)
    return res

Output of which now looks like:

>>> h = helper(alphabet, 2)
>>> h
['aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']
>>> h = helper(alphabet, 3)
>>> h
['aaa', 'aab', 'aac', 'aba', 'abb', 'abc', 'aca', 'acb', 'acc', 'baa', 'bab', 'bac', 'bba', 'bbb', 'bbc', 'bca', 'bcb', 'bcc', 'caa', 'cab', 'cac', 'cba', 'cbb', 'cbc', 'cca', 'ccb', 'ccc']

Can you make out what's happening, or should I write up an explanation? (But please make an effort first.)

ljetibo
  • Ahhh I finally understand! Thank you :) Every time it gets the result, it uses it to build the next layer and overwrites the old result. `First run: ['0', '1'] Second run: ['00', '01', '10', '11'] Third run: ['000', '001', '010', '011', '100', '101', '110', '111']` – South Paw Mar 06 '15 at 01:43
  • @SouthPaw That is correct, and if previous result didn't exist, it takes a 1 element list with empty string `[""]` which means it adds nothing. – ljetibo Mar 06 '15 at 01:44
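
The layer-by-layer build-up described in these comments can be made visible with a small tracing sketch. The function name `trace_layers` and the `layers` list are hypothetical helpers added for illustration; the loop body is the one from the question.

```python
def trace_layers(alphabet, length):
    """Print and collect the intermediate result after every pass."""
    alphabet = sorted(alphabet)  # sorted so the output order is predictable
    result = []
    layers = []  # hypothetical helper: keeps a reference to each layer
    for i in range(length):
        # each pass builds a new list from the previous one, then rebinds result
        result = [str(x) + str(y) for x in alphabet for y in result or ['']]
        layers.append(result)
        print("pass", i + 1, result)
    return layers

layers = trace_layers("01", 3)
# pass 1 ['0', '1']
# pass 2 ['00', '01', '10', '11']
# pass 3 ['000', '001', '010', '011', '100', '101', '110', '111']
```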

It's just a double list comprehension; it's a lot simpler if you start from a single one:

print([a for a in "abc"])  # ['a', 'b', 'c']

If you put two for ... in ... clauses in there, it's just like a nested loop:

result = [XXX for a in list_a for b in list_b for c in list_c ...]

where XXX is a function of the a, b, c, ... elements of the various containers you have used (the parts after the `in`s). This translates to:

result = []
for a in list_a:
  for b in list_b:
    for c in list_c:
      ...  # however much nesting you want
      result.append(XXX)

You can see this by swapping the orders around:

print([a+b for a in "ab" for b in "xy"])  # ['ax', 'ay', 'bx', 'by']
print([a+b for b in "xy" for a in "ab"])  # ['ax', 'bx', 'ay', 'by']

Now for your actual example:

for i in range(length):
  result = [str(x)+str(y) for x in alphabet for y in result or ['']]


This just follows the same rules as above, but there is one extra little hitch: the `result or ['']` part. It is possible to have `if ... else ...` style constructs in list comprehensions, though you'll probably want to avoid those for now. That is *not* what's happening here, though; the code is a little clearer if you change it to this:

for i in range(length):
  result = [str(x)+str(y) for x in alphabet for y in (result or [''])]

Essentially, it's taking advantage of the fact that [] (i.e. an empty list) counts as False when treated as a bool*. So if the list is empty, it is replaced with ['']. This is necessary because if result is an empty list, the inner for loop never runs and nothing is produced. A list containing an empty string works just fine, though, as it has one element (an empty string) in it.

*It is treated as a bool because the `or` operator is used. If you do something like this:

a = var1 or var2

then afterwards, a will be var1, unless var1 is one of the objects which counts as False, in which case a will be var2:

print(False or "other")      # other
print(None or "other")       # other
print(True or "other")       # True
print("" or "other")         # other
print(" " or "other")        # ' '
print(0 or "other")          # other
print(1 or "other")          # 1
print(134513456 or "other")  # 134513456
print([] or "other")         # other
print([0] or "other")        # [0]
...

I'm sure you get the picture.
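
To see why the fallback to `['']` matters on the first pass, here is a short sketch that runs the comprehension with and without it. The names `without_fallback` and `with_fallback` are made up for this illustration.

```python
alphabet = ['a', 'b']
result = []

# Iterating over an empty list runs the inner loop zero times,
# so nothing is ever produced.
without_fallback = [x + y for x in alphabet for y in result]

# `result or ['']` supplies a single empty string instead,
# so every letter is emitted exactly once.
with_fallback = [x + y for x in alphabet for y in result or ['']]

print(without_fallback)  # []
print(with_fallback)     # ['a', 'b']
```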

The other potentially confusing aspect of the code you had is that it sets the variable result while using the same variable inside the list comprehension. This is okay, because the variable doesn't get set until the list comprehension has completed.
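
A small sketch of that evaluation order, assuming a second name `previous` (introduced here only for the demonstration) that keeps a reference to the old list:

```python
result = ['a', 'b']
previous = result  # second name for the same list object

# The right-hand side is evaluated in full, reading the old list,
# before the name `result` is bound to the newly built list.
result = [x + y for x in 'ab' for y in result]

print(previous)            # ['a', 'b']
print(result)              # ['aa', 'ab', 'ba', 'bb']
print(result is previous)  # False
```

The old list is never modified; the assignment only makes `result` point at a different list.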

So to convert it into something using normal loops, this:

for i in range(length):
  result = [str(x)+str(y) for x in alphabet for y in result or ['']]

becomes:

result = [""]
for i in range(length):
  temporary_result = []
  for original_sequence in result:
    for new_letter in alphabet:
      temporary_result.append(original_sequence + new_letter)
  result = temporary_result[:]

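where `[:]` is a simple way of copying a list. (The copy is not strictly needed here, because temporary_result is bound to a fresh list on every pass.)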
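
Wrapped into a function, the loop version might look like the sketch below. The name `allstrings_loops` is hypothetical, the alphabet is sorted only to make the order predictable, and the comparison against `itertools.product` is added here purely as a sanity check.

```python
import itertools

def allstrings_loops(alphabet, length):
    """Loop-only sketch; `allstrings_loops` is a hypothetical name."""
    alphabet = sorted(alphabet)
    result = [""]  # one empty string to start building from
    for _ in range(length):
        temporary_result = []
        for original_sequence in result:
            for new_letter in alphabet:
                temporary_result.append(original_sequence + new_letter)
        result = temporary_result  # rebind; no copy required
    return result

expected = ["".join(p) for p in itertools.product("ab", repeat=3)]
print(allstrings_loops("ab", 3) == expected)  # True
```

One difference from the comprehension version: for a length of 0 this sketch returns `['']` rather than `[]`, because result starts out as `[""]`.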

will