0

Given a set {1, 2, 3, 4, 5 ... n} of n elements, we need to find all subsets of length up to k.

For example,

Input: n = 4 and k = 2

Output: {{1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}}

private static final List<Set<Set<Integer>>> innerPowerSets = new ArrayList<>();

public Set<Set<Integer>> powerSet(final int numberOfItems, final int maximumSubsetLength, final List<Set<Set<Integer>>> innerPowerSets) {
        if (innerPowerSets.isEmpty()) {
            innerPowerSets.add(new HashSet<>());
            innerPowerSets.get(0).add(new HashSet<>());
        } else {
            log.info("Saved Power Sets: " + innerPowerSets.size());
        }

        final int[] missingPowerSets;

        if (numberOfItems+1 > innerPowerSets.size()) {
            missingPowerSets = IntStream.range(innerPowerSets.size(), numberOfItems+1).toArray();
        } else {
            return innerPowerSets.get(numberOfItems);
        }

        for (final Integer item : missingPowerSets) {
            final Set<Set<Integer>> previousPowerSet = innerPowerSets.get(innerPowerSets.size() - 1);
            final Set<Set<Integer>> temp = new HashSet<>(previousPowerSet);
            for (Set<Integer> innerSet : previousPowerSet) {
                innerSet = new HashSet<>(innerSet);
                if (innerSet.size() < maximumSubsetLength) {
                    innerSet.add(item);
                    temp.add(innerSet);
                }
            }
            innerPowerSets.add(new HashSet<>(temp));
        }
        return innerPowerSets.get(innerPowerSets.size()-1);
    }

The above code is in iterative pattern with memoization, the reason being I need to call it multiple times and don't want to waste time in calculating the same subsets again and again.

Problem: I have a list of objects for which I need subsets of length up to k. I used the above code to get the subsets of indices and then directly uses this indices_subset to get the Object_subsets. Storing subsets of indices helps me to apply it to any length of the Object list. But the problem is, it is taking too much time. If I remove the memoization and directly apply power set calculation of length up to k, it is quite fast.

If some more information is required please comment.

Direct Object power set up to length k with iterative pattern calculator:

public static <T> List<List<T>> powerSet(final List<T> originalSet, final int subsetLengthOf) {
   final List<List<T>> result = new ArrayList<>();
   result.add(new ArrayList<>());
   for (final T item : originalSet) {
      final List<List<T>> temp = new ArrayList<>();
      for (List<T> innerSet : result) {
         innerSet = new ArrayList<>(innerSet);
         if (innerSet.size() < subsetLengthOf) {
            innerSet.add(item);
            temp.add(innerSet);
         }
      }
      result.addAll(temp);
   }
   return result;
}
mux032
  • 65
  • 2
  • 8

1 Answers1

0

You aren't really getting benefits of memoization as you're claming your first approach is doing.

Memoization as a concept:

Memoization is an approach to avoid re-computation of same problem (same inputs). You avoid re-computation by simply storing the result (against a specific input) and when the same input is given, you don't re-compute and simply do a lookup in your array and return the result. This is the main saving that you're doing using memoizations.

Notice here that the more you reuse the previous sub-problem's solutions, the more performance benefit you'll get. In your first approach, you aren't re-using much. (You're just re-using the previous subproblem's solution - and creating a new solution)


Let us define the problem as P(numberOfItems, maximumSubsetLength).

You're trying to find P(numberOfItems, maximumSubsetLength) using P(numberOfItems, maximumSubsetLength - 1). And that's the case with both the approaches. The only difference is that in your first approach, you're storing all the results (without mutating them) in innerPowerSets.

If you notice closely you're just using previousPowerSet and nothing else from the innerPowerSets final Set<Set<Integer>> previousPowerSet = innerPowerSets.get(innerPowerSets.size() - 1); And this is exactly what your second approach is doing. The only difference is that it's not storing result of every sub-problem. It's just using the previous sub-problem's solution and mutating that to get the next sub-problem's solution. And hence you're getting performance improvements.


Crux: The more you re-use the more performance benefit you'll get. Both of your approaches are just using the previous sub-problem's solution (and nothing else). And the first solution (which you're claiming is memoized one) is doing extra work to create copies of existing solution and then mutate it to get current solution. The second iterative approach is simply mutating the previous solution to get current sub-problem's solution (without any extra work of copying - as it's simply mutating the previous one)

Lavish Kothari
  • 2,211
  • 21
  • 29