Lame string-based approach: format every even number below 2^maxlen as a zero-padded binary string and yield it only if it does not contain "11":
def gen(maxlen):
    pattern = "{{:0{}b}}".format(maxlen)
    for i in range(0, 2**maxlen, 2):
        s = pattern.format(i) # not ideal, because we always format to test for "11"
        if "11" not in s:
            yield s
Superior mathematical approach (M has no two adjacent 1 bits exactly when M xor M * 2 = M * 3):
def gen(maxlen):
    pattern = "{{:0{}b}}".format(maxlen)
    for i in range(0, 2**maxlen, 2):
        if i ^ i*2 == i*3:
            yield pattern.format(i)
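Why the test works: i*3 is i + i*2, and an addition gives the same result as XOR exactly when no bit position produces a carry, i.e. when i and i*2 have no set bit in common. That is the case precisely when i has no two adjacent 1 bits. A small brute-force check of this, as a sketch (the helper names `has_no_adjacent_ones` and `check_identity` are made up for illustration):

```python
def has_no_adjacent_ones(i):
    """Arithmetic test: True if the binary form of i never contains "11"."""
    return i ^ (i * 2) == i * 3


def check_identity(bits=12):
    """Compare the arithmetic test with the string test for all i < 2**bits."""
    for i in range(2 ** bits):
        by_string = "11" not in format(i, "b")
        by_carry = (i & (i << 1)) == 0  # equivalent carry-free formulation
        if not (has_no_adjacent_ones(i) == by_string == by_carry):
            return False
    return True


print(check_identity())
```

The `i & (i << 1) == 0` form expresses the same condition and may read more directly, but I have not benchmarked it against the XOR version.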
Here's a benchmark for 6 different implementations (Python 3!):
from time import perf_counter  # time.clock() was removed in Python 3.8
from itertools import product

def math_range(maxlen):
    pattern = "{{:0{}b}}".format(maxlen)
    for i in range(0, 2**maxlen, 2):
        if i ^ i*2 == i*3:
            yield pattern.format(i)

def math_while(maxlen):
    pattern = "{{:0{}b}}".format(maxlen)
    # largest even number below 2**maxlen; with 2**maxlen - 1 the loop
    # would also test (and yield) 2**maxlen, which has maxlen + 1 digits
    maxnum = 2**maxlen - 2
    i = 0
    while True:
        if i ^ i*2 == i*3:
            yield pattern.format(i)
        if i >= maxnum:
            break
        i += 2

def itertools_generator(max_len):
    return filter(lambda i: '11' not in i, (''.join(i) + '0' for i in product('01', repeat=max_len-1)))

def itertools_list(maxlen):
    return list(filter(lambda i: '11' not in i, (''.join(i) + '0' for i in product('01', repeat=maxlen-1))))

def string_based(maxlen):
    pattern = "{{:0{}b}}".format(maxlen)
    for i in range(0, 2**maxlen, 2):
        s = pattern.format(i)
        if "11" not in s:
            yield s

def generate(pre0, pre1, cur_len, max_len):
    # pre0: finished prefix, pre1: most recently chosen bit
    if (cur_len == max_len-1):
        yield "".join((pre0, pre1, "0"))
        return
    if (pre1 == '1'):
        # a '1' may only be followed by '0'
        yield from generate(pre0+pre1, "0", cur_len+1, max_len)
    else:
        yield from generate(pre0+pre1, "0", cur_len+1, max_len)
        yield from generate(pre0+pre1, "1", cur_len+1, max_len)

def string_based_smart(val):
    yield from generate("", "", 0, val)

def benchmark(val, *funcs):
    for i, func in enumerate(funcs, 1):
        start = perf_counter()
        for g in func(val):
            g  # consume the result without storing it
        print("{}. {:6.2f} - {}".format(i, perf_counter()-start, func.__name__))

benchmark(24, string_based_smart, math_range, math_while, itertools_generator, itertools_list, string_based)
Some numbers for string length = 24 (in seconds):
1. 0.24 - string_based_smart
2. 1.73 - math_range
3. 2.59 - math_while
4. 6.95 - itertools_generator
5. 6.78 - itertools_list
6. 6.45 - string_based
shx2's algorithm is clearly the winner, followed by the math-based approaches. Pythonic code makes quite a difference if you compare the results of both math approaches (note: range objects are lazy as well, they do not build a list in Python 3).
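A likely reason for the gap: the loop-based versions have to test every even number below 2^maxlen, while the prefix-building approach only ever constructs valid strings. The number of valid strings follows the Fibonacci recurrence, so it grows much more slowly than the number of candidates. A rough sketch to compare the two counts (the function names are made up for illustration):

```python
def count_candidates(maxlen):
    """Even numbers below 2**maxlen that the loop-based versions must test."""
    return 2 ** (maxlen - 1)


def count_results(maxlen):
    """Strings of length maxlen that end in '0' and contain no "11".

    Dropping the final '0' leaves an arbitrary "11"-free string of
    length maxlen - 1; those counts follow the Fibonacci recurrence.
    """
    a, b = 1, 2  # "11"-free strings of length 0 and of length 1
    for _ in range(maxlen - 1):
        a, b = b, a + b
    return a


def count_by_brute_force(maxlen):
    """Reference count using the arithmetic test from above."""
    return sum(1 for i in range(0, 2 ** maxlen, 2) if i ^ i*2 == i*3)


print(count_results(24), "of", count_candidates(24))
```

For maxlen = 24 this gives 75025 results out of 8388608 candidates, so more than 99% of the work in the loop-based versions is spent on numbers that get rejected.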
Noteworthy: the itertools_* functions are almost equally slow, but itertools_list needs a lot more memory to store the list in (~6 MB spike in my test). All other generator-based solutions have a minimal memory footprint, because they only need to store the current state and not the entire result.
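To reproduce the memory difference without an external profiler, one option is the standard library's `tracemalloc`. The following is only a sketch with a smaller length to keep it quick; `no_double_ones`, `peak_bytes`, `drain` and `materialize` are names invented for this example, and the absolute numbers will differ between Python versions and platforms:

```python
import tracemalloc
from itertools import product


def no_double_ones(maxlen):
    """Lazily yield "11"-free strings of length maxlen ending in '0'."""
    candidates = (''.join(p) + '0' for p in product('01', repeat=maxlen - 1))
    return (s for s in candidates if '11' not in s)


def drain(iterable):
    """Consume the results one by one without keeping them."""
    for _ in iterable:
        pass


def materialize(iterable):
    """Store all results in a list, like itertools_list does."""
    return len(list(iterable))


def peak_bytes(consume, maxlen):
    """Peak traced allocation while `consume` walks over the results."""
    tracemalloc.start()
    consume(no_double_ones(maxlen))
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


print("generator:", peak_bytes(drain, 16), "bytes")
print("list:     ", peak_bytes(materialize, 16), "bytes")
```

The list variant should show a clearly higher peak, because every result string stays alive until the list is complete.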
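None of the shown functions blows up the stack: the loop-based ones do not recurse at all, and string_based_smart only recurses maxlen levels deep (24 here), well below Python's default recursion limit. Python does not optimize tail recursion, so anything that would recurse deeply has to be written with loops and generators.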
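If the recursive generator calls are a concern, the same prefix-building idea can be written with an explicit stack. This is a minimal sketch of that variant, not shx2's original code and not part of the benchmark above; the name `no_adjacent_ones_iter` is made up:

```python
def no_adjacent_ones_iter(maxlen):
    """Yield all maxlen-character bit strings that end in '0' and contain
    no "11", using an explicit stack instead of recursive generator calls."""
    if maxlen < 1:
        return
    stack = [""]  # prefixes that still have to be extended
    while stack:
        prefix = stack.pop()
        if len(prefix) == maxlen - 1:
            yield prefix + "0"  # the last bit is always 0 (even number)
            continue
        # Push the '1' branch first so the '0' branch is popped first,
        # which keeps the output in ascending order.
        if not prefix.endswith("1"):
            stack.append(prefix + "1")
        stack.append(prefix + "0")


for s in no_adjacent_ones_iter(3):
    print(s)
```

The stack never holds more than about maxlen + 1 prefixes, so the memory footprint stays small. I have not timed this version, so I can't say how it compares to the recursive one.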
//edit: naive C++ implementation of math_range (MSVS 2013):
#include "stdafx.h"
#include <iostream>
#include <bitset>
#include <ctime>
#include <fstream>

using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
    const unsigned __int32 maxlen = 24;
    const unsigned __int32 maxnum = 2 << (maxlen - 1); // 2^maxlen

    clock_t begin = clock();

    ofstream out;
    out.open("log.txt");
    if (!out.is_open()){
        cout << "Can't write to target";
        return 1;
    }

    for (unsigned __int32 i = 0; i < maxnum; i+=2){
        if ((i ^ i * 2) == i * 3){
            out << std::bitset<maxlen>(i) << "\n"; // don't use std::endl, it flushes the stream on every line
        }
    }

    out.close();

    clock_t end = clock();
    double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
    cout << elapsed_secs << endl;
    return 0;
}
It takes 0.08 seconds(!) for maxlen = 24 (/Ox).
An implementation of shx2's algorithm in C++ is non-trivial, because a recursive approach would lead to stack overflow (ha ha), and there's no yield. See:
But if you want raw speed, then there's no way around it.