What is the complexity of the code to find a word in a set of cubes

Question

I have solved the problem here. Previously I thought the complexity was O(n!), where n is the number of characters in the word.

But today I feel that is wrong. It should be 6^(characters in the word), where 6 is the number of sides of a cube.

More generally, for a cube with an arbitrary number of sides, the complexity should be O(cubefaces ^ (characters in the input word)).

Can someone please explain the time complexity in this case?

Are you asking for the time complexity of *your* solution, or the time complexity of the *best possible solution*? — templatetypedef, Jun 12 '15 at 22:07
@templatetypedef-OP is asking about time complexity of his/her solution as posted in that link. — Am_I_Helpful, Jun 12 '15 at 22:08
just FWIW This can be solved in polynomial time, O(n^2) or even better using bipartite matching. — Niklas B., Jun 12 '15 at 22:17
@NiklasB. Oh man, that's beautiful. I don't think I ever realized that connection! — templatetypedef, Jun 12 '15 at 22:22
@NiklasB. do you really know any algorithm that can find a maximal bipartite matching in O(n^2) or better? — dened, Jun 15 '15 at 16:39
@dened Actually I thought there can only be 6n edges, which is of course wrong. Since this is not the case it would be O(n^2.5) without further insights — Niklas B., Jun 15 '15 at 16:41

dened · Answer 1 · 2015-06-16T06:14:23.957

If the code starts with the check `if (cubeMatrix.length != word.length()) return false;`, and every side letter of a cube is unique (i.e. no two sides of a cube have the same letter), then the time complexity of your algorithm is O(S^{N - S + 1} S!) when N >= S, and O(S N!) when N <= S. Here S is the number of cube sides, and N is the number of cubes.

In brief, you make the recursive call only when there is an unused letter in the word that matches the cube side letter, so in the worst case the number of recursive calls made from one invocation is no more than the number of word letters left unused. The number of unused word letters decreases as the recursion depth increases, and eventually it becomes less than the number of cube sides. That is why the complexity becomes factorial at the final recursion depths.

A bit more detail

Let f(n) be the number of times findWordExists is called with cubeNumber = n. Let g(n) be the number of recursive calls (now with cubeNumber = n + 1) that a single findWordExists call with cubeNumber = n makes.

f(0) = 1, because you call findWordExists non-recursively only once.

f(n) = f(n - 1) g(n - 1) when n > 0.

In the worst case g(n) = min { S, N - n }, because, as I already pointed out, findWordExists is called recursively no more times than the number of letters left (the if (frequency > 0) check is responsible for this), and the number of letters left equals the number of cubes left, i.e. N - n.

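Now we can calculate how many times findWordExists is called in total. Take N >= S, so that g(n) = S for n <= N - S and g(n) = N - n after that:
f(0) + f(1) + ... + f(N) =
= 1 + g(0) + g(0) g(1) + ... + g(0) g(1) ... g(N - 1) =
= 1 + S + S² + ... + S^{N - S + 1} + S^{N - S + 1} (S - 1) + S^{N - S + 1} (S - 1) (S - 2) + ... + S^{N - S + 1} (S - 1) (S - 2) ... 1 =
= O(S^{N - S} S!).
The last term equals S^{N - S + 1} (S - 1)! = S^{N - S} S!, and it dominates the whole sum up to a constant factor.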

But every findWordExists call (except the final ones) iterates over each side, so we need to multiply the number of findWordExists calls by the number of sides: S O(S^{N - S} S!) = O(S^{N - S + 1} S!). That is our time complexity.
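
The recurrence above is easy to evaluate numerically. The following is a minimal sketch, not part of the original code: the class name CallCountBound and both method names are made up for illustration. It only evaluates f(0) + ... + f(N) with g(n) = min { S, N - n } and prints it next to S^{N - S} S!; it does not run findWordExists itself.

```java
public class CallCountBound {
    // Worst-case number of findWordExists calls: f(0) + f(1) + ... + f(N),
    // where f(0) = 1, f(n) = f(n - 1) * g(n - 1) and g(n) = min(S, N - n).
    static long worstCaseCalls(int cubes, int sides) {
        long callsAtDepth = 1; // f(0): the single non-recursive call
        long total = 1;
        for (int n = 0; n < cubes; n++) {
            int branching = Math.min(sides, cubes - n); // g(n)
            callsAtDepth *= branching;                  // now holds f(n + 1)
            total += callsAtDepth;
        }
        return total;
    }

    // Dominant term S^(N - S) * S!, meaningful only when N >= S.
    static long dominantTerm(int cubes, int sides) {
        long result = 1;
        for (int i = 0; i < cubes - sides; i++) {
            result *= sides; // S^(N - S)
        }
        for (int k = 2; k <= sides; k++) {
            result *= k;     // times S!
        }
        return result;
    }

    public static void main(String[] args) {
        int sides = 6; // assumption: ordinary six-sided cubes
        for (int cubes = 6; cubes <= 10; cubes++) {
            System.out.println("N = " + cubes
                    + ", calls = " + worstCaseCalls(cubes, sides)
                    + ", S^(N-S) S! = " + dominantTerm(cubes, sides));
        }
    }
}
```

For small inputs the sum can be checked by hand: with N = 4 and S = 2 the values f(0), ..., f(4) are 1, 2, 4, 8, 8, so the total is 23, while S^{N - S} S! = 8.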

Better Algorithm

Actually, your problem is a bipartite matching problem, so there are much more efficient algorithms than brute force, e.g. Kuhn’s algorithm.

The complexity of Kuhn’s algorithm is O(N M), where N is the number of vertices, and M is the number of edges. In your case, N is the number of cubes, and M is at most N² (every cube may match every letter position of the word), so the matching itself takes O(N³). But you also need to iterate over all the sides of all the cubes, which costs O(N S), where S is the number of cube sides. That term dominates when S is greater than N², so the total is O(N³ + N S).

Here is a possible implementation:

import java.util.*;

public class CubeFind {
    private static boolean checkWord(char[][] cubes, String word) {
        if (word.length() != cubes.length) {
            return false;
        }
        List<Integer>[] cubeLetters = getCubeLetters(cubes, word);
        int countMatched = new BipartiteMatcher().match(cubeLetters, word.length());
        return countMatched == word.length();
    }

    private static List<Integer>[] getCubeLetters(char[][] cubes, String word) {
        int cubeCount = cubes.length;

        Set<Character>[] cubeLetterSet = new Set[cubeCount];
        for (int i = 0; i < cubeCount; i++) {
            cubeLetterSet[i] = new HashSet<>();
            for (int j = 0; j < cubes[i].length; j++) {
                cubeLetterSet[i].add(cubes[i][j]);
            }
        }
        List<Integer>[] cubeLetters = new List[cubeCount];
        for (int i = 0; i < cubeCount; i++) {
            cubeLetters[i] = new ArrayList<>();
            for (int j = 0; j < word.length(); j++) {
                if (cubeLetterSet[i].contains(word.charAt(j))) {
                    cubeLetters[i].add(j);
                }
            }
        }
        return cubeLetters;
    }

    public static void main(String[] args) {
        char[][] m = {{'e', 'a', 'l'} , {'x', 'h' , 'y'},  {'p' , 'q', 'l'}, {'l', 'h', 'e'}};
        System.out.println("Expected true,  Actual: " + CubeFind.checkWord(m, "hell"));
        System.out.println("Expected true,  Actual: " + CubeFind.checkWord(m, "help"));
        System.out.println("Expected false, Actual: " + CubeFind.checkWord(m, "hplp"));
        System.out.println("Expected false, Actual: " + CubeFind.checkWord(m, "hplp"));
        System.out.println("Expected false, Actual: " + CubeFind.checkWord(m, "helll"));
        System.out.println("Expected false, Actual: " + CubeFind.checkWord(m, "hel"));
    }
}

class BipartiteMatcher {
    private List<Integer>[] cubeLetters;
    private int[] letterCube;
    private boolean[] used;

    int match(List<Integer>[] cubeLetters, int letterCount) {
        this.cubeLetters = cubeLetters;
        int cubeCount = cubeLetters.length;
        int countMatched = 0;

        letterCube = new int[letterCount];
        Arrays.fill(letterCube, -1);

        used = new boolean[cubeCount];
        for (int u = 0; u < cubeCount; u++) {
            if (dfs(u)) {
                countMatched++;

                Arrays.fill(used, false);
            }
        }
        return countMatched;
    }

    boolean dfs(int u) {
        if (used[u]) {
            return false;
        }
        used[u] = true;

        for (int i = 0; i < cubeLetters[u].size(); i++) {
            int v = cubeLetters[u].get(i);

            if (letterCube[v] == -1 || dfs(letterCube[v])) {
                letterCube[v] = u;
                return true;
            }
        }
        return false;
    }
}

@shekharsuman, please note, that _n_ corresponds to the `cubeNumber` argument of `findWordExists`. Therefore, f(0) is the number of calls to `findWordExists` with `cubeNumber` = 0. And it is indeed called only once (from `checkWord`): `findWordExists(cubeMatrix, charFreq, 0);`. So, that is not a mistake. — dened, Jun 16 '15 at 10:11
@dened - Yes, this really looks like a more exact (tighter) evaluation of what I left as an upper bound in my answer. Good approach. Upvoted too. — Am_I_Helpful, Jun 18 '15 at 14:49

Am_I_Helpful · Answer 2 · 2015-06-12T22:11:45.807

for (int i = 0; i <  cubeMatrix[cubeNumber].length; i++)

This loop runs once per character stored on the current cube (that is, once per face of the cube).

Also, inside this loop, you have:

if (frequency > 0) {
   charFreq.put(cubeMatrix[cubeNumber][i], frequency - 1);
   if (findWordExists(cubeMatrix, charFreq, cubeNumber + 1)) {
      return true;
          ..
          // and so on.

This results in a recursive call for cubeNumber + 1, which in turn recurses for cubeNumber + 2, and so on.

Finally, when this condition

if (cubeNumber == cubeMatrix.length) {
    for (Integer frequency : charFreq.values()) {
        if (frequency > 0) return false;
    }
    return true;
}

is met, the for-loop isn't executed any further.
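
Putting the quoted fragments together, the whole method presumably looks roughly like the sketch below. This is a reconstruction under assumptions, since the original code is only linked from the question: the checkWord wrapper, the null check, and the line that restores the frequency after a failed recursive call are guesses, and the class name CubeWordBruteForce is hypothetical. The test matrix is borrowed from the other answer.

```java
import java.util.HashMap;
import java.util.Map;

public class CubeWordBruteForce {
    // Assumed wrapper: builds the letter frequencies and starts at cube 0.
    static boolean checkWord(char[][] cubeMatrix, String word) {
        if (cubeMatrix.length != word.length()) {
            return false;
        }
        Map<Character, Integer> charFreq = new HashMap<>();
        for (char c : word.toCharArray()) {
            charFreq.merge(c, 1, Integer::sum);
        }
        return findWordExists(cubeMatrix, charFreq, 0);
    }

    static boolean findWordExists(char[][] cubeMatrix, Map<Character, Integer> charFreq, int cubeNumber) {
        // Base case: every cube has been assigned a letter.
        if (cubeNumber == cubeMatrix.length) {
            for (Integer frequency : charFreq.values()) {
                if (frequency > 0) return false;
            }
            return true;
        }
        // Try every face of the current cube that is still needed by the word.
        for (int i = 0; i < cubeMatrix[cubeNumber].length; i++) {
            char side = cubeMatrix[cubeNumber][i];
            Integer frequency = charFreq.get(side);
            if (frequency != null && frequency > 0) {
                charFreq.put(side, frequency - 1);
                if (findWordExists(cubeMatrix, charFreq, cubeNumber + 1)) {
                    return true;
                }
                charFreq.put(side, frequency); // assumed backtracking step
            }
        }
        return false;
    }

    public static void main(String[] args) {
        char[][] m = {{'e', 'a', 'l'}, {'x', 'h', 'y'}, {'p', 'q', 'l'}, {'l', 'h', 'e'}};
        System.out.println("hell: " + checkWord(m, "hell"));
        System.out.println("help: " + checkWord(m, "help"));
        System.out.println("hplp: " + checkWord(m, "hplp"));
    }
}
```

The for-loop and the recursive call inside it are the two pieces that the analysis below counts.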

Assume the number of cubes is n, and the number of characters stored on each cube (the generalised number of faces, in the OP's terms) is f.

Worst-case analysis:

For every cube from the 0th to the (n-1)th, the for-loop iterates cubeMatrix[cubeNumber].length times, which equals the number of characters stored on each cube, i.e. f times.

In each iteration of the for-loop, the recursion that starts at cubeNumber 0 can go n - 1 levels deeper, until it reaches the last cube.

Hence, for each of the f characters on a cube, we may have to recurse through all the remaining cubes (n in total, by our assumption).

Hence, the total number of combinations the code checks while looking for the word is f ^ n.

In your terms, f = cubefaces = the number of characters on the faces of your generalised cube;

and n = the total number of cubes available for your test.
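
The count can be written out level by level. This is only a sketch of the upper bound argued above, assuming f >= 2 and ignoring the pruning done by the frequency check and the cost of the final scan over charFreq:

```latex
\begin{align*}
\text{calls with cubeNumber} = k \;&\le\; f^{k}, \qquad k = 0, 1, \dots, n \\
\text{for-loop iterations in total} \;&\le\; f \sum_{k=0}^{n-1} f^{k}
  \;=\; \frac{f\,(f^{n} - 1)}{f - 1} \;\le\; 2 f^{n}
\end{align*}
```

So the number of loop iterations is O(f ^ n), which matches the figure above.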

The running time does depend on the character frequencies, which are decremented as letters of the word are used. When the word length doesn't match the number of cubes, the result is simply false.

But in those cases where the word length equals the number of cubes, the worst-case bound is independent of the word length.

Strictly, it also depends on the number of characters in the word (the comparison with the frequency prunes several cases), but in the worst-case scenario, unfortunately, it doesn't, because we end up checking all the characters on all of the available cubes to build the word.

But we can make the length of the word a parameter of the time complexity function, so we will be able to derive the asymptotic complexity with greater accuracy. See [my analysis](http://stackoverflow.com/a/30817064/2266855) for the case when the word length equals the number of the cubes. And the latter is the most important case, because for the other cases we can simply `return false` without any complex calculations. — dened, Jun 16 '15 at 06:42
@dened- Which is what I mentioned in my answer. See, the quoted-text in the second & third last paragraph. — Am_I_Helpful, Jun 16 '15 at 06:46
But you didn't mention how it changes asymptotic complexity. It is not _f^n_ in these cases, is it? — dened, Jun 16 '15 at 06:53
@dened- Well, first of all I talked everywhere about worst-case complexity as the dependency of word's characters can't be judged exactly; next- I don't see how you can calculate the complexity of the cases when you have conditional statements, means dependent on n OR f. When you don't have `f` OR `n`(given earlier/given at compile-time), I ***can't*** assume anything and evaluate the answer. I'll have to talk about the ***worst-case complexity only***. — Am_I_Helpful, Jun 16 '15 at 09:25
Actually, when doing an algorithm analysis you very often can make it more accurate if you take into account conditional statements. For example, I successfully do it in my analysis, and there I also talk about the worst case analysis only. — dened, Jun 16 '15 at 10:05
