
I am trying to split a list into subsets, covering every possible combination, with the condition that the final output has the same total length as the initial list and contains no repeated elements.

For the list:

X <- c("A","B","C","D")

All the non-null subsets are (let's call it Y):

[('A'), ('B'), ('C'), ('D'), ('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'),
('B', 'D'), ('C', 'D'), ('A', 'B', 'C'), ('A', 'B', 'D'), ('A', 'C', 'D'), 
('B', 'C', 'D'), ('A', 'B', 'C', 'D')]

What I am looking for is combinations of Y such that the elements within the combination are distinct values of X.

Some of the acceptable combinations would be:

 (('A',), ('B',), ('C', 'D'))
 (('A',), ('C',), ('B', 'D'))
 (('A',), ('D',), ('B', 'C'))
 (('B',), ('C',), ('A', 'D'))
 (('B',), ('D',), ('A', 'C'))
 (('C',), ('D',), ('A', 'B'))

I have tried generating all possible combinations of Y and then counting the distinct values in each combination.

If length(distinct elements of combination) equals length(X), I keep the combination. But this isn't an optimal method by any means, and it does not cover repeating scenarios.
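
The check described above can be tightened so that it also rejects combinations in which an element appears more than once (one reading of the "repeating scenarios" mentioned). This is a minimal sketch, assuming each candidate combination is stored as a list of character vectors; `is_partition` is a hypothetical helper name, not something from the original post:

```r
X <- c("A", "B", "C", "D")

# Hypothetical helper: TRUE only if `blocks` uses every element of `x` exactly once.
is_partition <- function(blocks, x) {
  elems <- unlist(blocks)
  length(elems) == length(x) &&  # same total length as x
    !anyDuplicated(elems) &&     # no element used twice
    setequal(elems, x)           # nothing missing, nothing extra
}

# A valid combination: A, B and C,D together cover X exactly once.
ok <- is_partition(list("A", "B", c("C", "D")), X)

# "B" appears twice here; counting distinct values alone would still give 4.
bad <- is_partition(list(c("A", "B"), c("B", "C", "D")), X)
```

Here `ok` is `TRUE` and `bad` is `FALSE`. Filtering every combination of Y this way still means generating all of those combinations first, so it only addresses correctness, not speed.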

Also, in my real world scenario, I have up to 40 distinct elements in X.
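
What is being described here amounts to enumerating the set partitions of X. Rather than filtering combinations of Y, the partitions can be built directly. The following is a sketch in base R, assuming the elements of X are distinct; `set_partitions` is an illustrative name, and the approach is only practical for small inputs because the number of partitions grows very quickly with `length(X)`:

```r
# Enumerate all set partitions of a vector x (elements assumed distinct).
# Each partition is returned as a list of blocks (vectors).
set_partitions <- function(x) {
  if (length(x) == 0) return(list(list()))  # one partition: the empty one
  first <- x[1]
  out <- list()
  for (p in set_partitions(x[-1])) {
    # Option 1: add `first` to each existing block in turn
    for (i in seq_along(p)) {
      q <- p
      q[[i]] <- c(first, q[[i]])
      out[[length(out) + 1]] <- q
    }
    # Option 2: `first` forms a new block on its own
    out[[length(out) + 1]] <- c(list(first), p)
  }
  out
}

X <- c("A", "B", "C", "D")
parts <- set_partitions(X)

# Keep only the partitions with exactly three blocks,
# which is the shape of the examples listed in the question.
three_blocks <- parts[lengths(parts) == 3]
```

For the four-element X this yields 15 partitions in total, 6 of which have three blocks.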

M--
  • You could use `partitions::listParts()` or `partitions::setparts()` as, e.g., [here](https://stackoverflow.com/a/10667092/980833). However, the number of partitions for a set of 40 items will be absolutely astronomical (some number like 40! or likely much greater), so you won't come close to being able to enumerate them all. – Josh O'Brien May 13 '19 at 21:03
  • @JoshO'Brien Thanks Josh, I did end up using that, but yes the data set does get huge and my machine bogs down at sets greater than 10. – constraint_random May 14 '19 at 19:47
  • @M-M I am trying a way to reduce my data set, or split it so that I can get all the combinations separately and go a cross combination on them, that might reduce the load. I'll update if I find a more efficient way – constraint_random May 14 '19 at 19:50
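
To get a feel for the sizes discussed in the comments: the number of ways to partition a set of n items is the Bell number B(n). Below is a small sketch that computes these with the Bell triangle recurrence in base R. `bell_numbers` is an illustrative name, and the values are stored as doubles, so the larger entries are approximate:

```r
# Bell numbers B(1), ..., B(n) via the Bell triangle.
# Each row starts with the last entry of the previous row; every further
# entry is its left neighbour plus the entry above that neighbour.
bell_numbers <- function(n) {
  row <- 1
  bells <- numeric(n)
  bells[1] <- 1  # B(1) = 1
  for (i in seq_len(n - 1)) {
    new_row <- numeric(length(row) + 1)
    new_row[1] <- row[length(row)]
    for (j in seq_along(row)) {
      new_row[j + 1] <- new_row[j] + row[j]
    }
    row <- new_row
    bells[i + 1] <- row[length(row)]  # last entry of row i + 1 is B(i + 1)
  }
  bells
}

b <- bell_numbers(40)
b[c(4, 10, 40)]
```

B(4) is 15 and B(10) is 115975, while B(40) is on the order of 1e35, which is consistent with a machine bogging down shortly after 10 elements.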

1 Answer

X = c("A","B","C","D")
  1. Use `combn()` to list every non-empty subset:
comb = c()
for(n in 1:length(X)){
  comb = c(comb, apply(combn(X, n), MARGIN = 2, FUN = "paste", collapse = ""))
}
comb
 [1] "A"    "B"    "C"    "D"    "AB"   "AC"   "AD"   "BC"   "BD"   "CD"   "ABC"  "ABD"  "ACD" 
[14] "BCD"  "ABCD"
  2. Use `expand.grid()` to list every ordered pair of elements:
expand.grid(X, X)
   Var1 Var2
1     A    A
2     B    A
3     C    A
4     D    A
5     A    B
6     B    B
7     C    B
8     D    B
9     A    C
10    B    C
11    C    C
12    D    C
13    A    D
14    B    D
15    C    D
16    D    D
M--