
I'm trying to figure out the time complexity of a function that I wrote (it generates a power set for a given string):

public static HashSet<string> GeneratePowerSet(string input)
{
    HashSet<string> powerSet = new HashSet<string>();

    if (string.IsNullOrEmpty(input))
        return powerSet;

    int powSetSize = (int)Math.Pow(2.0, (double)input.Length);

    // Start at 1 to skip the empty string case
    for (int i = 1; i < powSetSize; i++)
    {
        string str = Convert.ToString(i, 2);
        string pset = str;
        for (int k = str.Length; k < input.Length; k++)
        {
            pset = "0" + pset;
        }

        string set = string.Empty;
        for (int j = 0; j < pset.Length; j++)
        {
            if (pset[j] == '1')
            {
                set = string.Concat(set, input[j].ToString());
            }
        }
        powerSet.Add(set);
    }
    return powerSet;
}

So my attempt is this:

  • let the size of the input string be n
  • the outer for loop iterates 2^n - 1 times (the power set has 2^n elements and the empty set is skipped).
  • the two inner for loops together iterate at most 2*n times per outer iteration.

1. So Big-O would be O((2^n)*n) (since we drop the constant 2)... is that correct?

And n*(2^n) is worse than n^2.

if n = 4 then
(4*(2^4)) = 64
(4^2) = 16

if n = 10 then
(10*(2^10)) = 10240
(10^2) = 100

2. Is there a faster way to generate a power set, or is this about optimal?

Kiril
  • I don't actually know c#, but I believe that string.concat has to copy the string each call, which means that the complexity is actually O(n^2*2^n). – Chris Hopman Jan 08 '11 at 11:14

2 Answers


A comment:

The above function is part of an interview question where the program is supposed to take in a string, then print out the words in the dictionary whose letters are an anagram subset of the input string (e.g. Input: tabrcoz Output: boat, car, cat, etc.). The interviewer claims that an n*m implementation is trivial (where n is the length of the string and m is the number of words in the dictionary), but I don't think you can find valid sub-strings of a given string. It seems that the interviewer is incorrect.

I was given the same interview question when I interviewed at Microsoft back in 1995. Basically the problem is to implement a simple Scrabble playing algorithm.

You are barking up completely the wrong tree with this idea of generating the power set. Nice thought, clearly way too expensive. Abandon it and find the right answer.

Here's a hint: run an analysis pass over the dictionary that builds a new data structure more amenable to efficiently solving the problem you actually have to solve. With an optimized dictionary you should be able to achieve O(nm). With a more cleverly built data structure you can probably do even better than that.
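One possible reading of that hint, sketched under assumptions: precompute a letter-count signature for every dictionary word once, then answer each query by comparing counts. This is a swapped-in illustration, not necessarily the data structure the hint has in mind. The names `AnagramSubsetSketch`, `Preprocess`, and `FindWords` are hypothetical, and the code assumes words made only of the letters a-z.

```csharp
using System;
using System.Collections.Generic;

public static class AnagramSubsetSketch
{
    // Counts occurrences of each letter 'a'..'z'.
    // Assumption: any other character is simply ignored.
    private static int[] CountLetters(string s)
    {
        var counts = new int[26];
        foreach (char c in s.ToLowerInvariant())
        {
            if (c >= 'a' && c <= 'z')
                counts[c - 'a']++;
        }
        return counts;
    }

    // Analysis pass: compute the letter counts of every dictionary word once.
    public static List<KeyValuePair<string, int[]>> Preprocess(IEnumerable<string> words)
    {
        var table = new List<KeyValuePair<string, int[]>>();
        foreach (string word in words)
            table.Add(new KeyValuePair<string, int[]>(word, CountLetters(word)));
        return table;
    }

    // Query: a word can be formed when none of its letter counts
    // exceeds the corresponding count in the input string.
    public static List<string> FindWords(string input, List<KeyValuePair<string, int[]>> table)
    {
        int[] available = CountLetters(input);
        var result = new List<string>();
        foreach (var entry in table)
        {
            bool fits = true;
            for (int k = 0; k < 26; k++)
            {
                if (entry.Value[k] > available[k])
                {
                    fits = false;
                    break;
                }
            }
            if (fits)
                result.Add(entry.Key);
        }
        return result;
    }
}
```

With the dictionary { "boat", "car", "cat", "dog", "boot" } and the input "tabrcoz", `FindWords` returns boat, car, and cat: "dog" needs letters the input lacks, and "boot" needs two o's. After preprocessing, each query does a constant amount of work per dictionary word plus one pass over the input, so no subsets of the input are ever enumerated.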

Eric Lippert
  • thanks... I was able to get the O(nm) solution and I submitted both of my solutions (this way they see what I did in the process). – Kiril Jan 08 '11 at 23:16
  • @Lirik: awesome. I assume that what you're doing is internally sorting the dictionary words, as I describe here http://blogs.msdn.com/b/ericlippert/archive/tags/scrabble/ (note that I am solving the simpler problem of finding all the bingos). Now, *can you do even better*? **What if you built a trie out of the canonicalized word list?** I would be interested to see if you can come up with a solution that is even faster than O(nm) with a trie, or a proof as to why it's not possible. – Eric Lippert Jan 09 '11 at 15:52

2. Is there a faster way to generate a power set, or is this about optimal?

Your algorithm is reasonable, but your string handling could use improvement.

string str = Convert.ToString(i, 2);
string pset = str;
for (int k = str.Length; k < input.Length; k++)
{
    pset = "0" + pset;
}

All you're doing here is setting up a bitfield, but using a string. Just skip this, and use variable i directly.

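for (int j = 0; j < input.Length; j++)
{
    // C# has no implicit int-to-bool conversion, so compare against 0
    if ((i & (1 << j)) != 0)
    {
        // bit j of i is set: append input[j] to the subset being built
    }
}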

When you build the string, use a StringBuilder instead of creating multiple intermediate strings.

// At the beginning of the method
StringBuilder set = new StringBuilder(input.Length);
...
// Inside the loop
set.Clear();
...
set.Append(input[j]);
...
powerSet.Add(set.ToString());

Will any of this change the complexity of your algorithm? No. But it will significantly reduce the number of extra String objects you create, which will provide you a good speedup.
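Putting the two suggestions together, a minimal sketch might look like the following. The class name `PowerSetSketch` and the 30-character guard are assumptions added for illustration; the guard exists only because an `int` bitmask cannot index more subsets than that.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PowerSetSketch
{
    // Returns every non-empty subset of the characters of input, as strings.
    public static HashSet<string> GeneratePowerSet(string input)
    {
        var powerSet = new HashSet<string>();
        if (string.IsNullOrEmpty(input))
            return powerSet;

        // Assumption: a signed 32-bit mask limits the input to 30 characters.
        if (input.Length > 30)
            throw new ArgumentException("Input is too long for an int bitmask.", nameof(input));

        int powSetSize = 1 << input.Length;

        // One StringBuilder, created once and reused for every subset.
        var set = new StringBuilder(input.Length);

        // Start at 1 to skip the empty subset.
        for (int i = 1; i < powSetSize; i++)
        {
            set.Clear();
            for (int j = 0; j < input.Length; j++)
            {
                // Bit j of i decides whether input[j] is in this subset.
                if ((i & (1 << j)) != 0)
                    set.Append(input[j]);
            }
            powerSet.Add(set.ToString());
        }
        return powerSet;
    }
}
```

For example, `PowerSetSketch.GeneratePowerSet("abc")` yields the seven strings a, b, ab, c, ac, bc, and abc. The bits are read from the least significant end here, so the subsets are produced in a different order than in the original code, but the resulting set is the same.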

David Yaw
  • You made me realize that I could just do set = set + input[j].ToString(); wouldn't that be as fast as using the StringBuilder? – Kiril Jan 07 '11 at 22:27
  • That would create two additional strings each time you did it: One holding the single letter at `input[j]`, one for the result of the concatenation. Modifying an already existing object is faster. (The `+` operator compiles to `String.Concat` anyway.) Also, see edit: It is more efficient to create & reuse a single StringBuilder than to create and throw away several thousand of them. – David Yaw Jan 07 '11 at 22:41