6

Given a set** S containing duplicate elements, how can one determine the total number all the possible subsets of S, where each subset is unique.

For example, say S = {A, B, B} and let K be the set of all subsets, then K = {{}, {A}, {B}, {A, B}, {B, B}, {A, B, B}} and therefore |K| = 6.

Another example would be if S = {A, A, B, B}, then K = {{}, {A}, {B}, {A, B}, {A, A}, {B, B}, {A, B, B}, {A, A, B}, {A, A, B, B}} and therefor |K| = 9

It is easy to see that if S is a real set, having only unique elements, then |K| = 2^|S|.

What is a formula to calculate this value |K| given a "set" S (with duplicates), without generating all the subsets?

** Not technically a set.

Nixuz
  • 3,439
  • 4
  • 39
  • 44
  • This is really a math question, not a programming question. – Eddie Apr 10 '09 at 02:04
  • It is for a programming related problem I have and such a formula is important for analyzing the running time of certain combinatorics related algorithms. – Nixuz Apr 10 '09 at 02:28

2 Answers2

16

Take the product of all the (frequencies + 1).

For example, in {A,B,B}, the answer is (1+1) [the number of As] * (2+1) [the number of Bs] = 6.

In the second example, count(A) = 2 and count(B) = 2. Thus the answer is (2+1) * (2+1) = 9.

The reason this works is that you can define any subset as a vector of counts - for {A,B,B}, the subsets can be described as {A=0,B=0}, {A=0,B=1}, {0,2}, {1,0}, {1,1}, {1,2}.

For each number in counts[] there are (frequencies of that object + 1) possible values. (0..frequencies)

Therefore, the total number of possiblities is the product of all (frequencies+1).

The "all unique" case can also be explained this way - there is one occurence of each object, so the answer is (1+1)^|S| = 2^|S|.

v3.
  • 2,003
  • 15
  • 18
  • As soon as I read even just the first part of your answer, I could see that it was right. Now I feel stupid for not seeing this as it is pretty obvious to see once someone explains it. Either way thank you, you have saved me lots of time and frustration. – Nixuz Apr 10 '09 at 02:50
  • But how is it handling the order of numbers in the subset – Nikhil Rathore Jan 25 '19 at 17:26
  • I don't think it is correct. For set (a, b, a) the number of subsets should be (2 + 1) * (1 + 1) = 6. But including the empty subset there are 7 subsets. – Nikhil Rathore Jan 25 '19 at 17:40
  • @NikhilRathore, how did you get the 7th one? It makes more sense if you represent list as list of counts as the author mentioned. Given your example (a, b, a) -> (a=2, b=1), there can only be 6: (a=0, b=0) [empty set], (a=0, b=1), (a=1, b=0), (a=1, b=1), (a=2, b=0), (a=2, b=1). – Harry Cho Mar 18 '19 at 00:16
4

I'll argue that this problem is simple to solve, when viewed in the proper way. You don't care about order of the elements, only whether they appear in a subset of not.

Count the number of times each element appears in the set. For the one element set {A}, how many subsets are there? Clearly there are only two sets. Now suppose we added another element, B, that is distinct from A, to form the set {A,B}. We can form the list of all sets very easily. Take all the sets that we formed using only A, and add in zero or one copy of B. In effect, we double the number of sets. Clearly we can use induction to show that for N distinct elements, the total number of sets is just 2^N.

Suppose that some elements appear multiple times? Consider the set with three copies of A. Thus {A,A,A}. How many subsets can you form? Again, this is simple. We can have 0, 1, 2, or 3 copies of A, so the total number of subsets is 4 since order does not matter.

In general, for N copies of the element A, we will end up with N+1 possible subsets. Now, expand this by adding in some number, M, of copies of B. So we have N copies of A and M copies of B. How many total subsets are there? Yes, this seems clear too. To every possible subset with only A in it (there were N+1 of them) we can add between 0 and M copies of B.

So the total number of subsets when we have N copies of A and M copies of B is simple. It must be (N+1)*(M+1). Again, we can use an inductive argument to show that the total number of subsets is the product of such terms. Merely count up the total number of replicates for each distinct element, add 1, and take the product.

See what happens with the set {A,B,B}. We get 2*3 = 6.

For the set {A,A,B,B}, we get 3*3 = 9.