I am using a piece of itertools code (thanks SO!) that looks like:
from itertools import combinations, product

# Break down into selectable sub-groups by unit name
groups = {k: [b for b in blobs if b.Unit == k] for k in ['A', 'B', 'C', 'D', 'E', 'F']}
# Special treatment for unit F: expand to combination chunks of length 3
groups['F'] = combinations(groups['F'], 3)
# Create the list of all combinations (one pick per unit)
selected = list(product(*groups.values()))
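To make the behavior concrete, here's a self-contained toy version of the same pipeline (the Blob namedtuple and the tiny sample data are invented purely for illustration); on small inputs it does exactly what I want:

from itertools import combinations, product
from collections import namedtuple

# Made-up stand-in for my real objects, just for illustration
Blob = namedtuple('Blob', ['name', 'Unit'])

blobs = [Blob(f'{u}{i}', u) for u in 'ABCDE' for i in range(2)]  # 2 per unit A-E
blobs += [Blob(f'F{i}', 'F') for i in range(4)]                  # 4 in unit F

groups = {k: [b for b in blobs if b.Unit == k] for k in 'ABCDEF'}
groups['F'] = combinations(groups['F'], 3)   # C(4,3) = 4 three-blob chunks

selected = list(product(*groups.values()))
print(len(selected))   # 2*2*2*2*2 * 4 = 128 tuples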
The problem is that my blobs list above contains about 400 items, which means the resulting list object, selected, would hold trillions and trillions of possible combinations (something like 15^9). I'm not new to programming, but I am new to working with large data sets. What kind of hardware should I be looking at to handle itertools output like this? Are there any reasonably affordable machines that can handle this type of thing? I've obviously taken my Python skills beyond my trusty iMac...
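For what it's worth, I can estimate how big selected would get without building it, since the output size of product() is just the product of the input sizes. A quick sketch, reusing the blobs list from above; it assumes Python 3.8+ (for math.prod and math.comb), and the even 400/6 split in the comment is only a guess at my real distribution:

from math import comb, prod  # both available in Python 3.8+

# Count the tuples product() would yield, without materializing any of them
sizes = {k: len([b for b in blobs if b.Unit == k]) for k in 'ABCDEF'}
total = prod(sizes[k] for k in 'ABCDE') * comb(sizes['F'], 3)
print(f'{total:,} tuples')
# If the ~400 blobs split evenly (about 66 per unit), this is
# 66**5 * comb(66, 3) ~= 5.7e13 -- tens of trillions of tuples

Even at eight bytes per reference, that many six-tuples works out to petabytes of memory, so calling list() on the full product seems off the table on any machine I could realistically buy.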