from itertools import *
import collections
for i in combinations_with_replacement(['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f'],15):
    b = (''.join(i))
    freq = collections.Counter(b)
    for k in freq:
        if freq [k] < 5:
            print(k)

This code should print the characters whose count is less than 5.

What I am trying to do: check, on the fly, whether every character of the joined string is repeated fewer than x times (at any position in the string), and print only the strings for which that is true.

The problem is that no matter what I try, it either prints everything and ignores the `if`, or prints nothing. How do I do this correctly, or does Python have a simple solution for it?

The result should be, for example with "fewer than 5":

False - fffaaffbbdd (f repeated 5 times)
False - fffffaaaaac (a and f each repeated 5 times)
True -  aaabbbccc11 (no character repeated more than 4 times)

To explain the question more clearly: filter out all strings that contain a character with more than x repetitions before passing them to the next function. In this example the next step is simply printing the strings that follow the rule and not printing the ones that break it.

Karl Knechtel
tseries

2 Answers


If I understand you right, you want to print only the strings in which each character occurs at most 4 times:

from collections import Counter
from itertools import combinations_with_replacement


for i in combinations_with_replacement(['0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f'],15):
    c = Counter(i)
    if c.most_common(1)[0][1] > 4:
        continue
    print(''.join(i))

Prints:

...

00002446899cccd
00002446899ccce
00002446899cccf
00002446899ccdd

... 
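
The loop in the question prints individual characters whose count is below 5 instead of testing the string as a whole. A minimal sketch of the per-string test, checked against the three examples from the question (`within_limit` is a hypothetical helper name, and the default limit of 4 is taken from those examples):

```python
from collections import Counter


def within_limit(s, limit=4):
    """Return True if no character occurs more than `limit` times in s.

    `within_limit` is a hypothetical helper name, not a library function.
    """
    return all(count <= limit for count in Counter(s).values())


# The three examples from the question.
for s in ("fffaaffbbdd", "fffffaaaaac", "aaabbbccc11"):
    print(within_limit(s), "-", s)
```

This performs the same check as `c.most_common(1)[0][1] > 4` above, only phrased as a reusable predicate.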
Andrej Kesely
  • Something is wrong, this does not work as expected. Try simply running the plain combinations and then running them with this filter: it causes a ^x slowdown. – tseries Oct 13 '20 at 13:48
  • I need to filter out every string with too many repeated characters before passing it to the next function, so that the next function gets faster by not receiving unneeded data. But in this example, for some reason, it is faster to simply process all the data and then filter it by pattern with startwith.x ... The next function is 2 rounds of SHA-256 hashing, so if the filter is slower than the hashing it is not usable :) – tseries Oct 13 '20 at 13:53
  • The example solution is usable, but only for "slow work" :( – tseries Oct 13 '20 at 14:23

a more constructive approach (meaning: i do not iterate over all possible combinations - i construct the valid combinations directly).

you need to have sympy installed for this to work.

in the example i only use the elements "abcdef" and restrict the repetitions to be strictly smaller than MAX = 4. i fix the length of the strings to be output at M = 6.

i start by getting all the partitions of M whose parts are no larger than k=MAX - 1 (the largest allowed repetition count) and that do not consist of more than m=N parts. i immediately convert those to a list:

{3: 2} [3, 3, 0, 0, 0, 0]
{3: 1, 2: 1, 1: 1} [3, 2, 1, 0, 0, 0]
{3: 1, 1: 3} [3, 1, 1, 1, 0, 0]
{2: 3} [2, 2, 2, 0, 0, 0]
{2: 2, 1: 2} [2, 2, 1, 1, 0, 0]
{2: 1, 1: 4} [2, 1, 1, 1, 1, 0]
{1: 6} [1, 1, 1, 1, 1, 1]

of those lists i iterate over the multiset permutations - i mean those to represent the elements that i select and how often they are repeated: e.g:

[2, 1, 2, 0, 0, 1] -> "aabccf"  # 2*"a", 1*"b", ..., 0*"e", 1*"f"

the result you want is then the multiset permutation of those strings.
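
The mapping from a list of counts to a string, and the number of distinct orderings of that string, can be checked with the standard library alone. A small sketch using the example above; the names `part` and `el` mirror the code below, while the multinomial count is an added illustration and not part of the algorithm itself:

```python
from collections import Counter
from math import factorial

elements = "abcdef"
part = [2, 1, 2, 0, 0, 1]  # how often each element is repeated

# Repeat each element by its count and concatenate.
el = "".join(n * v for n, v in zip(part, elements))
print(el)  # aabccf

# Number of distinct orderings (multiset permutations) of that string:
# len(el)! divided by the factorial of each character's count.
orderings = factorial(len(el))
for count in Counter(el).values():
    orderings //= factorial(count)
print(orderings)  # 180
```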

from sympy.utilities.iterables import multiset_permutations, partitions

MAX = 4  # (all counts < MAX)
elements = "abcdef"
N = len(elements)
M = 6  # output length


def dict_to_list(dct, N):
    ret = [0] * N
    j = 0
    for k, v in dct.items():
        ret[j:j + v] = [k] * v
        j += v
    return ret


for dct in partitions(M, k=MAX - 1, m=N):
    lst = dict_to_list(dct, N)
    for part in multiset_permutations(lst):
        el = ''.join(n * v for n, v in zip(part, elements))
        for msp in multiset_permutations(el):
            print(''.join(msp))

for your case you'd then need to change:

MAX = 5  # (all counts < MAX)
elements = "0123456789abcdef"
M = 15  # output length

but the complexity of that is huge (though way better than that of the original approach)!
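
To get a feeling for how huge, the number of valid strings can be counted without generating any of them. The sketch below is an added illustration under stated assumptions: `count_valid` is a hypothetical helper, it uses a simple dynamic programme over the symbols, and it needs Python 3.8+ for `math.comb`.

```python
from math import comb


def count_valid(n_symbols, length, max_count):
    """Count strings of `length` over `n_symbols` distinct symbols in which
    every symbol occurs fewer than `max_count` times.

    `count_valid` is a hypothetical helper, not a sympy function.
    """
    # ways[l] = number of valid strings of length l built from the
    # symbols processed so far.
    ways = [1] + [0] * length
    for _ in range(n_symbols):
        new = [0] * (length + 1)
        for l, w in enumerate(ways):
            if w == 0:
                continue
            # Add j copies of the next symbol; comb(l + j, j) chooses
            # their positions inside the longer string.
            for j in range(min(max_count - 1, length - l) + 1):
                new[l + j] += w * comb(l + j, j)
        ways = new
    return ways[length]


print(count_valid(6, 6, 4))    # small example above: 44220
print(count_valid(16, 15, 5))  # the hex case from the question
```

For the small example this should match the number of lines the loop above prints; for the hex case it only shows how many strings any generator would have to produce.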

hiro protagonist
  • I tried `for msp in multiset_permutations(el, size=15):`, but it works only for fewer than the number of elements, and I don't know whether it works correctly in these variants. – tseries Oct 13 '20 at 16:16
  • I understand that I can simply repeat the elements 2 times and make the string longer (but that is a stupid way, and it wastes time generating all the duplicates again...). I don't see any way in the sympy documentation to control it as with product or combinations. – tseries Oct 13 '20 at 16:36
  • Simply try max=4 and size=3: you get FFF as a result. – tseries Oct 13 '20 at 16:42
  • So the count of symbols is controlled, but control over the length is lost :( – tseries Oct 13 '20 at 16:51
  • Not only 15, that is just an example; I think it somehow needs to be controllable as with the simple permutations, combinations and product in itertools. As I understand it, size=x exists in sympy for this, but it works strangely. If I use max=2 and size=3 it works. – tseries Oct 13 '20 at 16:58
  • It works the same as putting `for msp in multiset_permutations(el, size=15):` (it repeats the same combinations many times), but it works for more than 7 symbols; I don't know if it is correct. – tseries Oct 13 '20 at 18:12
  • I need to check it simply on a small range, for repetition and a known maximum of permutations - not used. – tseries Oct 13 '20 at 18:19
  • could you give me an example of `elements, M, MAX` where you see a combination appearing more than once? – hiro protagonist Oct 13 '20 at 18:29
  • It does not repeat if M is more than N, but it does if M is less (see the updated question). – tseries Oct 13 '20 at 18:49
  • + Even better, if possible: does sympy have something like product in itertools? Something that can give all possible variants but filter them by a rule; that would be clearer for testing than simple permutation. – tseries Oct 13 '20 at 18:57
  • At this point, simply putting `for msp in multiset_permutations(el, size=15)`, as an example, works for less, and M works if M is more than MAX. – tseries Oct 13 '20 at 18:59
  • tried to fix that. added the parameter `m=N` that limits the number of partitions. – hiro protagonist Oct 13 '20 at 20:28
  • Something is wrong in the latest variant: the results are ccdcefffedefedd ccdcefffedfddee when elements = "1234567890abcdef", so it only uses the last 4 symbols from the list. – tseries Oct 14 '20 at 09:40
  • i do not understand. the counters (`Counter("ccdcefffedefedd").most_common()`) for your 2 examples show that no element was selected more than 4 times. `elements` is the pool of elements i select from. if you only want `cdef` in your result you need to set `elements = "cdef"`. – hiro protagonist Oct 14 '20 at 10:13
  • All elements must be mixed, but none may repeat more than x times in the string, no matter the position. This solution selects only the last 4 elements from `elements` and mixes only them :) For example, 16772fff299f86 is true (if no more than 4) and 156345786abc is true, but 116f6f4f7afbcf is false (5 times F). – tseries Oct 14 '20 at 10:26
  • no, it creates **all** of them. but the space is so huge that you only get to see the first few combinations (if you do not run the loops to the end - which will take a **very long** time). if you reduce the number of elements you will see that all valid combinations are created. – hiro protagonist Oct 14 '20 at 10:29
  • Understood, I thought it only selected the last ones. – tseries Oct 14 '20 at 10:30
  • try smaller spaces like `elements = "0123456"; MAX = 3; M = 8` in order to see what the algorithm does. it is also easier to understand that way. – hiro protagonist Oct 14 '20 at 10:37
  • Yes, it works, but again the "long time" before it starts mixing everything makes it unusable, for example for crypto (where, before passing strings to the next function, I want to know there will be only valid ones with no more than 4 repetitions of any hex symbol) :( But it is a correct answer to the question. And it is a good algorithm for mixing "DNA", "RNA" etc... :) – tseries Oct 14 '20 at 14:55
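
The `product`-like behaviour asked about in the comments can be sketched with the standard library alone. The generator below is an added illustration, not part of either answer: `limited_product` is a hypothetical name, it assumes the characters in `elements` are distinct, and it uses plain backtracking so that a character is never appended once it has reached the limit. It yields strings in the order of `elements` and starts producing output immediately, but no claim is made that it is fast enough to keep ahead of the hashing step.

```python
from itertools import islice


def limited_product(elements, length, limit):
    """Yield every string of `length` over `elements` in which no
    character occurs more than `limit` times.

    Hypothetical helper: plain backtracking, not an itertools or
    sympy function. Assumes the characters in `elements` are distinct.
    """
    counts = dict.fromkeys(elements, 0)
    prefix = []

    def extend():
        if len(prefix) == length:
            yield "".join(prefix)
            return
        for ch in elements:
            if counts[ch] < limit:  # prune: never exceed the limit
                counts[ch] += 1
                prefix.append(ch)
                yield from extend()
                prefix.pop()
                counts[ch] -= 1

    yield from extend()


# Show only the first few strings of the hex case from the question.
for s in islice(limited_product("0123456789abcdef", 15, 4), 5):
    print(s)
```

Each result can be passed straight to the next function, so no invalid string is ever built in full and then thrown away.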