I have a list in a for loop, and it uses itertools.product() to find different combinations of letters. I want to use collections.Counter() to count the number of occurrences of each item; however, right now it prints all the different combinations of "A"s and "G"s:
['a', 'A', 'G', 'G']
['a', 'A', 'G', 'g']
['a', 'A', 'G', 'G']
['a', 'A', 'G', 'g']
['a', 'A', 'G', 'g']
#...
['a', 'G', 'A', 'G']
['a', 'G', 'a', 'g']
['a', 'G', 'A', 'G']
['a', 'G', 'a', 'G']
['a', 'G', 'a', 'G']
#...
['a', 'G', 'a', 'G']
['a', 'G', 'A', 'G']
['a', 'G', 'a', 'g']
['a', 'G', 'A', 'G']
['a', 'G', 'a', 'G']
#...
['a', 'G', 'A', 'G']
['a', 'G', 'a', 'G']
['a', 'G', 'a', 'G']
# etc.
This isn't the full output, but as you can see, some results contain the same letters in a different order, for example:
['a', 'G', 'A', 'G']
['a', 'A', 'G', 'G']
I would much prefer the latter ordering, so I want to find a way to print all of the combinations sorted alphabetically ('a' before 'g'), with capital letters before lower case. The final product should look like ['AaGG', 'aaGg', etc]. What function or functions should I use?
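To make the ordering described above concrete, here is a minimal sketch of one way it could be expressed: a sort key that compares letters case-insensitively first and breaks ties by putting the capital before the lowercase letter. The helper name `normalize` is hypothetical and is not part of the code in this question.

```python
def normalize(genotype):
    """Return the letters sorted alphabetically, capitals before lowercase.

    Hypothetical helper (not from the original code). The key sorts by the
    lowercased letter first, then by islower(), so 'A' (False) comes
    before 'a' (True).
    """
    return ''.join(sorted(genotype, key=lambda c: (c.lower(), c.islower())))


print(normalize('aGAG'))                # AaGG
print(normalize(['a', 'G', 'a', 'g']))  # aaGg
```

The function accepts either a string or a list of single letters, since `sorted()` works on any iterable.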
This is the code that generates the data. The section marked "Counting" is what I'm having trouble with.
import itertools
from collections import Counter
parent1 = 'aaGG'
parent2 = 'AaGg'
f1 = []
f1_ = []
genotypes = []
b = []
genetics = []
g = []
idx = []
parent1 = list(itertools.combinations(parent1, 2))
del parent1[0]
del parent1[4]
parent2 = list(itertools.combinations(parent2, 2))
del parent2[0]
del parent2[4]
for x in parent1:
    f1.append(''.join(x))
for x in parent2:
    f1_.append(''.join(x))
y = list(itertools.product(f1, f1_))
for x in y:
    genotypes.append(''.join(x))
    break
genotypes = [
    thingies[0][0] + thingies[1][0] + thingies[0][1] + thingies[1][1]
    for thingies in zip(parent1, parent2)
] * 4
print 'F1', Counter(genotypes)
# Counting
for genotype in genotypes:
    alleles = list(itertools.combinations(genotype,2))
    del alleles[1]
    del alleles[3]
    for x in alleles:
        g.append(''.join(x))
for idx in g:
    if idx.lower().count("a") == idx.lower().count("g") == 1:
        break
f2 = list(itertools.product(g, g))
for x in f2:
    genetics.append(''.join(x))
for genes in genetics:
    if genes.lower().count("a") == genes.lower().count("g") == 2:
        genes = ''.join(genes)
        print Counter(genes)
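One thing worth noting about the last line: Counter applied to a single string counts its individual letters, not whole genotypes. A possible sketch of counting whole genotypes instead is to put each string into a consistent order first and then hand the whole collection to Counter. The names `canonical` and `count_genotypes` are hypothetical, and the sample input is illustrative only, not the actual output of the program above.

```python
from collections import Counter


def canonical(genotype):
    # Hypothetical helper: alphabetical order, capital before lowercase.
    return ''.join(sorted(genotype, key=lambda c: (c.lower(), c.islower())))


def count_genotypes(genotypes):
    # Counter receives an iterable of whole strings, so each reordered
    # genotype is one item; passing a single string would count letters.
    return Counter(canonical(g) for g in genotypes)


sample = ['aGAG', 'aAGG', 'aGag', 'aGaG']  # illustrative input only
counts = count_genotypes(sample)
print(sorted(counts.items()))  # [('AaGG', 2), ('aaGG', 1), ('aaGg', 1)]
```

Because 'aGAG' and 'aAGG' both become 'AaGG', they are tallied as the same genotype.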