Create List without similar crossovers

Question

I am trying to build a list of length 120 that contains 4 different numbers. The tricky part is that the same number must not appear twice in a row, and each number should occur exactly the same number of times.

This is my script.

import random
List = [1,2,3,4]
seq_CG = random.sample(List,len(List))
for i in range(121):
    randomList = random.sample(List, len(List))
    if randomList[0]!=seq_CG[-1]:
        seq_CG.append(randomList)
        i = len(seq_CG)
print List, randomList, seq_CG

I am pretty close. However, something does not work. And maybe there is even a shorter and more random solution?

In the big list seq_CG, I do not want the same number to appear twice in a row. In my example it is filled with a lot of smaller lists. However, it would be even nicer to have a flat random list of 120 numbers in which each number occurs equally often and no number appears twice in a row.

`However something does not work` <- could you please explain what you mean here? What is the observed output? What output did you want? How are the two different? – inspectorG4dget, Oct 12 '16 at 15:59
The same number shall not appear twice in a row, so there should not be [1,1] in the list. However, this is what happens. I guess seq_CG[-1] does not take the last element of the most recent list. – SDahm, Oct 12 '16 at 16:02
`Numbers shall not appear in a row` suggests that `[1,1]` should not be in `randomList`, which according to your code, it will not be. Beyond that, I really don't understand where you're having troubles – inspectorG4dget, Oct 12 '16 at 16:15

PM 2Ring · Accepted Answer · 2016-10-13T13:59:11.007

Here are a couple of solutions.

The first algorithm maintains an index idx into the sequence. On each step, idx is randomly moved to a different index, so it's impossible for a yielded value to equal the previous value (provided the elements of the sequence are distinct).

from random import randrange
from itertools import islice
from collections import Counter

def non_repeating(seq):
    m = len(seq)
    idx = randrange(0, m)
    while True:
        yield seq[idx]
        idx = (idx + randrange(1, m)) % m

seq = [1, 2, 3, 4]
print(''.join(map(str, islice(non_repeating(seq), 60))))

ctr = Counter(islice(non_repeating(seq), 12000))
print(ctr)

typical output

313231412323431321312312131413242424121414314243432414241413
Counter({1: 3017, 4: 3012, 3: 2993, 2: 2978})

The distribution of values produced by that code looks fairly uniform, but I haven't analysed it mathematically, and I make no guarantees as to its uniformity.
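
One way to back up that observation is to treat the index as a Markov chain: from any index, the update moves to each of the other m - 1 indices with equal probability. The sketch below is only an illustration, not part of the code above; the helper names transition_matrix and step are made up for it. It pushes a starting distribution through the chain repeatedly and prints the result, which settles on equal long-run frequencies. That says nothing about exact counts in a finite sample.

```python
def transition_matrix(m):
    # From index i the generator moves to each of the other m - 1 indices
    # with equal probability and never stays where it is.
    p = 1.0 / (m - 1)
    return [[0.0 if i == j else p for j in range(m)] for i in range(m)]

def step(dist, matrix):
    # One step of the chain: new_dist[j] = sum over i of dist[i] * matrix[i][j].
    m = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(m)) for j in range(m)]

m = 4
matrix = transition_matrix(m)

# Start with all probability on index 0 and let the chain run.
dist = [1.0, 0.0, 0.0, 0.0]
for _ in range(50):
    dist = step(dist, matrix)

print(dist)  # every entry ends up very close to 1/m
```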

The following code is more complex, but it does give a uniform distribution. Repeated values are not discarded; they are temporarily added to a pool of repeated values, and the algorithm tries to use values in the pool as soon as possible. If it can't find a suitable value in the pool, it generates a new random value.

from random import choice
from itertools import islice
from collections import Counter

def non_repeating(seq):
    pool = []
    prev = None
    while True:
        p = set(pool).difference([prev])
        if p:
            current = p.pop()
            pool.remove(current)
        else:
            current = choice(seq)
            if current == prev:
                pool.append(current)
                continue
        yield current
        prev = current

seq = [1, 2, 3, 4]
print(''.join(map(str, islice(non_repeating(seq), 60))))

ctr = Counter(islice(non_repeating(seq), 12000))
print(ctr)

typical output

142134314121212124343242324143123212323414131323434212124232
Counter({4: 3015, 2: 3005, 3: 3001, 1: 2979})

If the length of the input sequence is only 2 or 3 the pool can get quite large, but for longer sequences it generally only holds a few values.
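
To see how large the pool gets for a given input, the generator can be instrumented to report the pool size alongside each value. This is a minimal sketch that copies the pool algorithm above; the function name non_repeating_with_pool_size and the rng parameter are additions for this illustration, and the printed numbers will vary with the seed.

```python
import random
from itertools import islice

def non_repeating_with_pool_size(seq, rng):
    # Same pool algorithm as above, but yields (value, current pool size).
    pool = []
    prev = None
    while True:
        p = set(pool).difference([prev])
        if p:
            # Reuse a pooled value that differs from the previous one.
            current = p.pop()
            pool.remove(current)
        else:
            current = rng.choice(seq)
            if current == prev:
                # Park the repeat in the pool and draw again.
                pool.append(current)
                continue
        yield current, len(pool)
        prev = current

rng = random.Random(42)
for seq in ([1, 2], [1, 2, 3], [1, 2, 3, 4], list(range(10))):
    sizes = [size for _, size in islice(non_repeating_with_pool_size(seq, rng), 12000)]
    print(len(seq), max(sizes))  # input length, largest pool size seen
```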

Finally, here's a version that gives an exactly uniform distribution. Do not attempt to use it on an input sequence of 2 (or fewer) elements because it's likely to get stuck in an infinite loop; of course, there are only 2 solutions for such an input sequence anyway. :)

I'm not proud of this rather ugly code, but at least it does the job. I'm creating an output list of length 60 so that it fits nicely on the screen, but this code has no trouble generating much larger sequences.

from random import shuffle
from itertools import groupby
from collections import Counter

def non_repeating(seq, copies=3):
    seq = seq * copies
    while True:
        shuffle(seq)
        result, pool = [], []
        for k, g in groupby(seq):
            result.append(k)
            n = len(list(g)) - 1
            if n:
                pool.extend(n * [k])

        for u in pool:
            for i in range(len(result) - 1):
                if result[i] != u != result[i + 1]:
                    result.insert(i+1, u)
                    break
            else:
                break
        else:
            return result

# Test that sequence doesn't contain repeats
def verify(seq):
    return all(len(list(g)) == 1 for _, g in groupby(seq))

seq = [1, 2, 3, 4]
result = non_repeating(seq, 15)
print(''.join(map(str, result)))
print(verify(result))
print(Counter(result))

typical output

241413414241343212423232123241234123124342342141313414132313
True
Counter({1: 15, 2: 15, 3: 15, 4: 15})
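
For the list of 120 numbers in the question, calling the function above with copies=30 gives each of the 4 values exactly 30 times. As a point of comparison, here is a different and simpler technique, not the method used above: draw from the remaining items one at a time, never picking the previous value, and start over if a dead end is reached. It is a rough sketch under the assumption that the input values are distinct; the name balanced_non_repeating and the rng parameter are made up for this example, and it makes no claim that every valid arrangement is equally likely.

```python
import random
from collections import Counter

def balanced_non_repeating(values, copies, rng=random):
    # Assumption: values are distinct and there are at least two of them.
    if len(values) < 2 or len(set(values)) != len(values):
        raise ValueError('need at least 2 distinct values')
    while True:
        remaining = list(values) * copies
        result = []
        while remaining:
            # Items still available that differ from the last value placed.
            candidates = [v for v in remaining if not result or v != result[-1]]
            if not candidates:
                break  # dead end: only copies of the last value are left
            pick = rng.choice(candidates)
            remaining.remove(pick)
            result.append(pick)
        else:
            # Inner loop finished without a dead end.
            return result

result = balanced_non_repeating([1, 2, 3, 4], 30)
print(len(result))
print(Counter(result))
```

Because candidates are drawn from the remaining items, values with more copies left are more likely to be picked, which keeps dead ends (and therefore restarts) uncommon for an input like this one.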

Maybe I need to specify the question once more. I really need the same count of each number. I know that with long sequences the counts become approximately equal, but I need them exactly equal in a sequence of 120 numbers. – SDahm, Oct 13 '16 at 12:24
@SDahm: Sorry, I didn't realise that you wanted the distribution to be _exactly_ uniform. That's possible, but of course it does make the resulting sequence _much_ less random. – PM 2Ring, Oct 13 '16 at 13:07

Jon Clements · Answer 2 · 2016-10-12T17:15:34.760

A slightly naive approach is to generate an endless stream of random choices, collapse runs of consecutive duplicate values with groupby, and use islice to cap the output at the required length, e.g.:

from itertools import groupby, islice
from random import choice

def non_repeating(values):
    if not len(values) > 1:
        raise ValueError('must have more than 1 value')
    candidates = iter(lambda: choice(values), object())
    # Python 3.x -- yield from (k for k, g in groupby(candidates))
    # Python 2.x
    for k, g in groupby(candidates):
        yield k

data = [1, 2, 3, 4]
sequence = list(islice(non_repeating(data), 20))
# [3, 2, 1, 4, 1, 4, 1, 4, 2, 1, 2, 1, 4, 1, 4, 3, 2, 3, 4, 3]
# [3, 2, 3, 4, 1, 3, 1, 4, 1, 4, 2, 1, 2, 3, 2, 4, 1, 4, 2, 3]
# etc...
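
The two building blocks used here can be looked at in isolation. The snippet below is only an illustration with made-up sample data: groupby groups consecutive equal items, so keeping just the keys collapses each run to a single value, and the two-argument form of iter keeps calling a function until it returns the sentinel, which never happens with a fresh object().

```python
from itertools import groupby, islice
from random import choice

# groupby groups *consecutive* equal items; keeping only the keys
# collapses every run of repeats into one value.
raw = [3, 3, 2, 1, 1, 1, 4, 1, 1]
collapsed = [k for k, _ in groupby(raw)]
print(collapsed)  # [3, 2, 1, 4, 1]

# iter(callable, sentinel) calls the callable until it returns the sentinel.
# A fresh object() never equals a chosen value, so the stream is endless.
stream = iter(lambda: choice([1, 2, 3, 4]), object())
first_ten = [k for k, _ in islice(groupby(stream), 10)]
print(first_ten)
```

Collapsing runs removes the repeats but does not control how often each value occurs, so the counts are only approximately equal.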

edited Oct 12 '16 at 17:15

answered Oct 12 '16 at 16:35

This looks promising. However, I get a SyntaxError: invalid syntax for the yield from ... I am using Python 2.7 in OpenSesame. – SDahm Oct 12 '16 at 17:12
I added "import groupby, islice". But now list object is not callable. – SDahm Oct 12 '16 at 17:35
@SDahm on what line? – Jon Clements Oct 12 '16 at 17:52
Strange. Now it works :-) I only closed and reopened the program. One thing is still missing: the numbers should each occur the same number of times, in the example 5 times each. In the first line there are only four 3s. – SDahm Oct 12 '16 at 18:11
@SDahm that's an entirely different question - this only guarantees non-consecutive occurrences - which before your edit is a valid answer :) – Jon Clements Oct 12 '16 at 18:19
Still searching for a solution... Any other ideas? Or rearrangements of the already posted ideas? – SDahm Oct 13 '16 at 08:49
