I am trying to solve a problem coming from the area of biology in which I have to combine local sub-optimal solutions from each big element such that each sub particle is unique. The problem is that the possibilities could scale up to +4.000 local sub-optimal solutions and to +30.000 for the elements. Cartesian product is not an option as combining the lists is a n*m*p*
... problem, impossible without an algorithm beyond itertools
.
The general schema is:
[
[ [a,b,c],[d,e,a],[f], ...],
[ [f,e,t],[a,b,t],[q], ...],
[ [a,e,f],[],[p], ... up to 4.000],
... up to 30.000
]
[ [a,b,c],[d,e,a],[f],.....],
-> group of sub-optimal solutions for elem. #1
I want to find as fast as possible
First: one solution, which means a combination of one sub-optimal solution for each element (could include blank lists) such that there are no duplicates. For example [[a,b,c],[f,e,t][p]].
Second: all the compatible solutions.
I know is an open questions, however I need some guidance or general algorith to confront this problem, I can investigate further if I have something to start with.
I am using python for the rest of the lab work, however I am open to other languages.
We can start from a basic solver that handle less possibilities in terms of total sub-optimal and number of list of lists.
Best.
EDIT 1
A very-short-real example:
[[[1,2,3][1,2,4],[1,2,5],[5,8]],
[[1,3][7,8],[6,1]],
[[]],
[[9,10][7,5],[6,9],[6,10]]]
OPTIMAL SOLUTION (FROM ROW #):
#1 [1,2,3]
#2 [7,8]
#3 [9,10]
Output: [[1,2,3],[7,8],[9,10]]
Can see here https://pastebin.com/qq4k2FdW