52
int sum = 0;
for(int i = 1; i < n; i++) {
    for(int j = 1; j < i * i; j++) {
        if(j % i == 0) {
            for(int k = 0; k < j; k++) {
                sum++;
            }
        }
    }
}

I don't understand how when j = i, 2i, 3i... the last for loop runs n times. I guess I just don't understand how we came to that conclusion based on the if statement.

Edit: I know how to compute the complexity for all the loops except for why the last loop executes i times based on the mod operator... I just don't see how it's i. Basically, why can't j % i go up to i * i rather than i?

user11452926
  • hey its calculated exponentially like for first 2 it will be n 2 and *2 2*2=4 – Vipul Pandey Feb 11 '20 at 07:10
  • 5
    You can reduce the complexity of this code by multiple **large** factors. **Hint**: The sum of numbers 1 to n is ((n+1)*n)/2 **Hint 2**: `for (j = i; j < i *i; j += i)` then you don't need the modulus test (because `j` is guaranteed to be divisible by `i`). – Elliott Frisch Feb 11 '20 at 07:14
  • 1
    O() function is a ball-park function so any loop in this example is adding to complexity. The second loop is running up to n^2. if-statements are ignored. – Christoph Bauer Feb 11 '20 at 07:14
  • 11
    @ChristophBauer `if` statements are **absolutely not** ignored. This `if` statement means the complexity is O(n^4) instead of O(n^5), because it causes the innermost loop to only execute `i` times instead of `i*i` times for each iteration of the second loop. – kaya3 Feb 11 '20 at 07:15
  • Hint3: If we treat this as a function `sum(n)` then the value returned by `sum(n)` will have the same complexity as the cost for computing `sum(n)`. So if you can work out an algebraic formula for `sum(n)`, that will give you the computational complexity. – Stephen C Feb 11 '20 at 07:21
  • 1
    @kaya3 totally missed the `k < n^2` part. So it is O(n^5) but knowledge (by understanding the `if`) suggests O(n^4). – Christoph Bauer Feb 11 '20 at 07:24
    Along the lines of @ElliottFrisch 's comment, the loop over `j` could equivalently be rewritten with a new variable `jprime` as `for(jprime = 1; jprime < i; jprime++)` with `j = jprime * i`. – Gregory Puleo Feb 11 '20 at 17:09
  • 1
    If this isn't just a class exercise, change the second loop to for(int j = i; j < i * i; j+=i) – Cristobol Polychronopolis Feb 11 '20 at 18:43
  • 1
    j % i, or j mod i, can only go up to i (actually, i-1) because the modulus of one number by a second number, call it in general "b", is always and only in the range 0...b-1, inclusive. This is, in effect, part of the definition of "modulo". – The_Sympathizer Feb 11 '20 at 21:41
  • @ElliottFrisch Indeed. If you continue with this logic, you can refactor the 4 loops to just `(n - 1) * n * (n - 2) * (3 * n - 1) / 24`. – Eric Duminil Feb 12 '20 at 12:35

5 Answers

53

Let's label the loops A, B and C:

int sum = 0;
// loop A
for(int i = 1; i < n; i++) {
    // loop B
    for(int j = 1; j < i * i; j++) {
        if(j % i == 0) {
            // loop C
            for(int k = 0; k < j; k++) {
                sum++;
            }
        }
    }
}
  • Loop A iterates O(n) times.
  • Loop B iterates O(i²) times per iteration of A. For each of these iterations:
    • j % i == 0 is evaluated, which takes O(1) time.
    • On 1/i of these iterations, loop C iterates j times, doing O(1) work per iteration. Since j is O(i²) on average, and this is only done for 1/i iterations of loop B, the average cost is O(i² / i) = O(i).

Multiplying all of this together, we get O(n × i² × (1 + i)) = O(n × i³). Since i is on average O(n), this is O(n⁴).


The tricky part of this is saying that the if condition is only true 1/i of the time:

Basically, why can't j % i go up to i * i rather than i?

In fact, j does go up to j < i * i, not just up to j < i. But the condition j % i == 0 is true if and only if j is a multiple of i.

The multiples of i within the range are i, 2*i, 3*i, ..., (i-1) * i. There are i - 1 of these, so loop C is reached i - 1 times despite loop B iterating i * i - 1 times.
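
If you want to check this numerically, here is a small standalone snippet (my addition, not part of the original answer) that counts, for one sample value of i, how often loop B iterates and how often the condition lets loop C run:

int i = 7;                         // any sample value of i works
int bIterations = 0, cEntries = 0;
for (int j = 1; j < i * i; j++) {
    bIterations++;
    if (j % i == 0) {
        cEntries++;                // loop C would be entered on this iteration
    }
}
System.out.println(bIterations);   // 48, i.e. i * i - 1
System.out.println(cEntries);      // 6, i.e. i - 1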

kaya3
  • 2
    In O(n × i^2 × (1 + i)) why 1+i ? – Soleil Feb 11 '20 at 21:24
  • 3
    Because the `if` condition takes O(1) time on every iteration of loop B. It's dominated by loop C here, but I counted it above so it's just "showing my working". – kaya3 Feb 12 '20 at 00:56
15
  • The first loop consumes n iterations.
  • The second loop consumes n*n iterations. Imagine the case when i=n, then j=n*n.
  • The third loop contributes an average of n iterations per iteration of the second loop: it only runs for the multiples of i (about i of the i*i values of j), and when it runs it does up to i*i ≈ n*n steps, which averages out to roughly i, bounded by n, per iteration of the second loop.

Thus, the code complexity is O(n×n×n×n).
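
To see that the third loop really contributes a factor of n on average rather than n*n, here is a quick check I added (not part of the original answer); it measures, for one sample i, the average work the k loop does per iteration of the j loop:

int i = 100;                        // sample value; think of it as close to n
long jIterations = 0, kWork = 0;
for (int j = 1; j < i * i; j++) {
    jIterations++;
    if (j % i == 0) {
        kWork += j;                 // the k loop would run j times here
    }
}
// prints roughly 49.5, i.e. about i / 2, which is O(n)
System.out.println((double) kWork / jIterations);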

I hope this helps you understand.

Toby Speight
Mohammed Deifallah
5

All the other answers are correct; I just want to add the following. I wanted to see whether the reduction in executions of the inner k-loop was enough to bring the actual complexity below O(n⁴). So I wrote the following:

for (int n = 1; n < 363; ++n) {
    int sum = 0;
    for(int i = 1; i < n; ++i) {
        for(int j = 1; j < i * i; ++j) {
            if(j % i == 0) {
                for(int k = 0; k < j; ++k) {
                    sum++;
                }
            }
        }
    }

    // compare the measured iteration count (sum) against n³ and n⁴
    long cubic = (long) Math.pow(n, 3);
    long hypCubic = (long) Math.pow(n, 4);
    double relative = (double) (sum / (double) hypCubic);
    System.out.println("n = " + n + ": iterations = " + sum +
            ", n³ = " + cubic + ", n⁴ = " + hypCubic + ", rel = " + relative);
}

After executing this, it becomes obvious that the complexity is in fact n⁴. The last lines of output look like this:

n = 356: iterations = 1989000035, n³ = 45118016, n⁴ = 16062013696, rel = 0.12383254507467704
n = 357: iterations = 2011495675, n³ = 45499293, n⁴ = 16243247601, rel = 0.12383580700180696
n = 358: iterations = 2034181597, n³ = 45882712, n⁴ = 16426010896, rel = 0.12383905075183874
n = 359: iterations = 2057058871, n³ = 46268279, n⁴ = 16610312161, rel = 0.12384227647628734
n = 360: iterations = 2080128570, n³ = 46656000, n⁴ = 16796160000, rel = 0.12384548432498857
n = 361: iterations = 2103391770, n³ = 47045881, n⁴ = 16983563041, rel = 0.12384867444612208
n = 362: iterations = 2126849550, n³ = 47437928, n⁴ = 17172529936, rel = 0.1238518469862343

What this shows is that the ratio between the measured iteration count and n⁴ tends towards a value around 0.124... (actually 0.125). While it does not give us the exact value, we can deduce the following:

Time complexity is n⁴/8 ~ f(n) where f is your function/method.

  • The Wikipedia page on Big O notation states, in the 'Family of Bachmann–Landau notations' table, that ~ means the two sides are asymptotically equal, i.e. their ratio tends to 1. Or:

    f is equal to g asymptotically

(I chose 363 as the excluded upper bound because n = 362 is the last value for which we get a sensible result. After that, the int counter sum overflows and the relative value becomes negative.)
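
If you want to push the measurement further, declaring the counter as a long avoids that overflow (a small tweak of my own, at the cost of noticeably longer runtimes):

long sum = 0;   // instead of int sum = 0;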

User kaya3 figured out the following:

The asymptotic constant is exactly 1/8 = 0.125, by the way; here's the exact formula via Wolfram Alpha.
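
A short way to see where the 1/8 comes from (my own sketch, not taken from the linked formula): for a fixed i, the inner k loop performs i + 2i + ... + (i-1)*i increments in total, so

work(i) = i * (1 + 2 + ... + (i-1)) = (i³ - i²) / 2
sum(n)  = work(1) + ... + work(n-1) ≈ (1/2) * (n⁴ / 4) = n⁴ / 8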

TreffnonX
  • 5
    Of course, O(n⁴) * 0.125 = O(n⁴). Multiplying the runtime by a positive constant factor doesn't change the asymptotic complexity. – Ilmari Karonen Feb 11 '20 at 17:18
  • This is true. However, I was trying to reflect the actual complexity, not the upper-bound estimate. As I found no other syntax for expressing time complexity other than O-notation, I fell back on that. It is, however, not 100% sensible to write it like this. – TreffnonX Feb 12 '20 at 06:02
  • You can use [little-o notation](https://en.wikipedia.org/wiki/Big_O_notation#Little-o_notation) to say the time complexity is `n⁴/8 + o(n⁴)`, but it's possible to give a stricter expression `n⁴/8 + O(n³)` with big O anyway. – kaya3 Feb 12 '20 at 07:35
  • @TreffnonX Big O is a mathematically solid concept. So what you're doing is fundamentally wrong/meaningless. Of course you're free to redefine mathematical concepts, but that's a big can of worms you're opening then. The way to define it in a stricter context is what kaya3 described: you go an order "lower" and define it that way. (Though in mathematics you typically use the reciprocal.) – paul23 Feb 12 '20 at 07:39
  • You are correct. I corrected myself again. This time, I use the asymptotic growth towards the same limit, as defined in the Family of Bachmann–Landau notations on https://en.wikipedia.org/wiki/Big_O_notation#Little-o_notation . I hope this is now mathematically correct enough to not incite revolt ;) – TreffnonX Feb 12 '20 at 07:53
2

Remove if and modulo without changing the complexity

Here's the original method:

public static long f(int n) {
    int sum = 0;
    for (int i = 1; i < n; i++) {
        for (int j = 1; j < i * i; j++) {
            if (j % i == 0) {
                for (int k = 0; k < j; k++) {
                    sum++;
                }
            }
        }
    }
    return sum;
}

If you're confused by the if and modulo, you can just refactor them away, with j jumping directly from i to 2*i to 3*i ... :

public static long f2(int n) {
    int sum = 0;
    for (int i = 1; i < n; i++) {
        for (int j = i; j < i * i; j = j + i) {
            for (int k = 0; k < j; k++) {
                sum++;
            }
        }
    }
    return sum;
}

To make it even easier to calculate the complexity, you can introduce an intermediary j2 variable, so that every loop variable is incremented by 1 at each iteration:

public static long f3(int n) {
    int sum = 0;
    for (int i = 1; i < n; i++) {
        for (int j2 = 1; j2 < i; j2++) {
            int j = j2 * i;
            for (int k = 0; k < j; k++) {
                sum++;
            }
        }
    }
    return sum;
}

You can use a debugger or old-school System.out.println calls to check that the sequence of (i, j, k) triplets is the same in each method.

Closed form expression

As mentioned by others, you can use the fact that the sum of the first n integers is equal to n * (n+1) / 2 (see triangular numbers). If you use this simplification for every loop, you get:

public static long f4(int n) {
    // do the arithmetic in long so the intermediate product doesn't overflow int for larger n
    return (long) (n - 1) * n * (n - 2) * (3 * n - 1) / 24;
}
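
For completeness, here is one way to get that closed form from f3 (my own working, not part of the original answer), using the triangular-number identity together with the standard formulas for the sums of squares and cubes over i = 1..n-1:

sum(n) = Σ [ i * (1 + 2 + ... + (i-1)) ]
       = Σ [ (i³ - i²) / 2 ]
       = ( (n-1)² * n² / 4  -  (n-1) * n * (2n-1) / 6 ) / 2
       = (n - 1) * n * (3n² - 7n + 2) / 24
       = (n - 1) * n * (n - 2) * (3n - 1) / 24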

It is obviously not the same complexity as the original code but it does return the same values.
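
As a quick sanity check (my own addition; it assumes the four methods above live in the same class), you can compare them for small inputs:

public static void main(String[] args) {
    for (int n = 0; n < 50; n++) {
        long expected = f(n);
        if (f2(n) != expected || f3(n) != expected || f4(n) != expected) {
            throw new AssertionError("mismatch at n = " + n);
        }
    }
    System.out.println("f, f2, f3 and f4 agree for n < 50");
}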

If you google the first terms, you can notice that 0 0 0 2 11 35 85 175 322 546 870 1320 1925 2717 3731 appear in "Stirling numbers of the first kind: s(n+2, n).", with two 0s added at the beginning. It means that sum is the Stirling number of the first kind s(n, n-2).

Eric Duminil
1

Let's have a look at the first two loops.

The first one is simple: it loops from 1 to n. The second one is more interesting: it goes from 1 to i squared. Let's see some examples:

e.g. n = 4    
i = 1  
j loops from 1 to 1^2  
i = 2  
j loops from 1 to 2^2  
i = 3  
j loops from 1 to 3^2  

In total, the i and j loops combined perform roughly 1^2 + 2^2 + 3^2 iterations.
There is a formula for the sum of first n squares, n * (n+1) * (2n + 1) / 6, which is roughly O(n^3).
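As a quick check with the three squares above: 1^2 + 2^2 + 3^2 = 1 + 4 + 9 = 14, and the formula gives 3 * 4 * 7 / 6 = 14 as well.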

You have one last k loop which loops from 0 to j, but only when j % i == 0. Since j goes from 1 to i^2, j % i == 0 is true about i times (once for each multiple of i), i.e. on a 1/i fraction of the j iterations, and when it is true the k loop does up to i^2 ≈ n^2 steps. Averaged over the j loop, that is about i extra work per iteration, which is bounded by n, so the k loop contributes one more factor of O(n).

So you have O(n^3) from the i and j loops and another O(n) from the k loop, for a grand total of O(n^4).
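
To make the "i times" part visible, here is a small illustration I added (not part of the original answer): it prints the numbers 1..i*i row by row in an i-by-i grid and brackets the ones divisible by i, exactly one per row, all in the last column:

int i = 5;                                   // sample value
for (int row = 0; row < i; row++) {
    StringBuilder line = new StringBuilder();
    for (int col = 1; col <= i; col++) {
        int value = row * i + col;
        // bracket the values that satisfy value % i == 0
        line.append(value % i == 0 ? "[" + value + "]" : " " + value + " ").append(' ');
    }
    System.out.println(line);
}
// 5 rows, one bracketed number per row: i of the numbers 1..i*i are divisible by i
// (the actual j loop stops at i*i - 1, so it sees i - 1 of them, which is still O(i))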

Silviu Burcea
  • I know how to compute the complexity for all the loops except for why the last loop executes i times based on the mod operator... I just don't see how it's i. Basically, why can't j % i go up to i * i rather than i? – user11452926 Feb 11 '20 at 07:42
  • 1
    @user11452926 let's say the i was 5. j would go from 1 to 25 in the 2nd loop. However, `j % i == 0` only when j is 5, 10, 15, 20 and 25. 5 times, like the value of i. If you write down the numbers from 1 to 25 in a 5 x 5 square, only the 5th column would contain the numbers divisible by 5. This works for any value of i. Draw a square of n by n using the numbers 1 to n^2. The nth column will contain the numbers divisible by n. You have n rows, so n numbers from 1 to n^2 divisible by n. – Silviu Burcea Feb 11 '20 at 08:39
  • Thanks! makes sense! What if it was an arbitrary number like 24 rather than 25, will the square trick still work? – user11452926 Feb 11 '20 at 09:08
  • 25 comes when `i` hits 5, so the `j` loop goes from 1 to 25; you can't choose an arbitrary number. If your 2nd loop went to a fixed number, e.g. 24, instead of `i * i`, that would be a constant and wouldn't be tied to `n`, so it would be `O(1)`. If you're thinking about `j < i * i` vs. `j <= i * i`, that will not matter much, as there will be `n` and `n-1` operations, but in Big-O notation both mean `O(n)` – Silviu Burcea Feb 11 '20 at 13:04