
What is the Big O notation of this function (as a function of n = len(lst)):

def foo(lst):
    jump = 1
    total = 0
    while jump < len(lst):
        for i in range(0, len(lst), jump):
            total += i
        jump = jump * 2
    return total

I thought it was O(n*log(n)), but it's not, and I'm trying to understand why it's actually O(n). I'm kind of new to this, so it would be great if you could also explain how you got to the answer. Thanks!

2 Answers

You keep doubling your step: 1, 2, 4, 8, ...

So those ranges range(0, n, step) that you go through have sizes n, n/2, n/4, n/8, ... (well, approximately - rounded to ints).

The sum of those is 2n (approximately). Which is O(n).

If you didn't know that sum yet, it's easy to see: You start with 0. Then n gets you halfway to 2n. Adding n/2 gets you to 1.5n, again halfway to 2n. Adding n/4 gets you to 1.75n, again halfway to 2n. And so on. You only get closer and closer to 2n.
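
The same halving argument can be written as a finite geometric sum. This is only a sketch: it assumes the steps are 2^0, 2^1, ..., 2^(k-1), where k is the number of powers of 2 below n (k is a name introduced here, not from the question), and it ignores the rounding to ints:

```latex
\[
\sum_{j=0}^{k-1} \frac{n}{2^{j}}
  = n \left( 2 - \frac{1}{2^{k-1}} \right)
  < 2n
\]
```

Rounding each term up adds less than 1 per term, so the real count stays below 2n + k, with k about log2(n), which is still O(n).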

Kelly Bundy

The function is O(n).

To go into detail, the function makes about 2*n iterations (but O(2*n) is the same as O(n) in big O notation).

On each pass of the while loop, jump moves to the next power of 2, so for example for n=1000:

[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

Then for each subloop, you go through all element, thus effectively doing sum(successive_powers) (in the example 1+2+4+…+512 = 1023).

The sum of the first n powers of 2 is roughly the (n+1)-th power, so here: 2**(log2(n)+1) ~ 2*n.

An easy way to check this is to change the code:

def foo(lst):
    jump = 1
    total = 0
    while jump < len(lst):
        for i in range(0, len(lst), jump):
            total += 1 # count instead of summing
        jump = jump * 2
    return total

foo(list(range(1000)))
# 2000
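
A single input does not show how the count grows, so here is a rough sketch that runs the same counting idea for several sizes. `count_steps` is a hypothetical helper name, not from the question, and it takes n directly instead of building a list:

```python
def count_steps(n):
    """Count the inner-loop iterations foo performs for a list of length n."""
    jump = 1
    steps = 0
    while jump < n:
        # range(0, n, jump) has ceil(n / jump) elements
        steps += len(range(0, n, jump))
        jump *= 2
    return steps


for n in (10, 100, 1_000, 10_000, 100_000):
    steps = count_steps(n)
    # If the function is O(n), steps / n should stay bounded as n grows.
    print(n, steps, round(steps / n, 3))
```

If the ratio in the last column stays near a constant instead of growing with log(n), that is consistent with O(n), although a handful of inputs is evidence rather than a proof.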
mozway
  • Not my downvote but the sentence “Roughly O(2*n) to be precise” is nonsensical: O(2*n) is *exactly* O(n) because of the definition of the big-O function. Also, purely linguistically, “roughly” and “to be precise” are opposites. Furthermore, your “way of checking” only works if you actually test it with multiple inputs. A single input won’t tell you how the function’s runtime grows. – Konrad Rudolph Feb 10 '22 at 12:16
  • @Konrad I see, the roughly applied to 2*n, it can be a few units more or less due to the powers of 2. I'll update. Nevertheless I believe the general logic is correct, language shouldn't be a reason for DV. The example was just for illustration, of course OP should test several values. The mathematical logic is valid for any n – mozway Feb 10 '22 at 12:19
  • I've seen similar downvotes in a different question. I think someone is downvoting correct answers to elementary homework-style complexity theory questions rather than finding dupes and closing them. – Paul Hankin Feb 10 '22 at 12:20
  • @PaulHankin Full disclosure, I initially had voted to close the question, but OP made efforts in adding details so I rather chose to answer. – mozway Feb 10 '22 at 12:22
  • What do you mean with "approximately"? Those powers of 2 are exact and are exactly what happens. – Kelly Bundy Feb 10 '22 at 12:25
  • @Kelly I had provided a quick answer and now everyone seems to jump on language details ;) – mozway Feb 10 '22 at 12:26
  • Got it, thanks :) – just got here Feb 10 '22 at 12:28
  • *"for each subloop, you go through all element"* - I find this possibly misleading, as in my opinion, "all element" rather sounds like you mean elements of the list. Which is not true. And if you mean elements of the range, the "all" sounds strange. And `sum(successive_powers)` is at least misleading, but to me it just looks wrong. Their number of steps isn't 1+2+4+8+etc but rather n+n/2+n/4+n/8+etc. – Kelly Bundy Feb 10 '22 at 12:30
  • In other words, your "1+2+4+8+..." sounds like you think that they're using `range(0, jump)` instead of `range(0, n, jump)`. – Kelly Bundy Feb 10 '22 at 12:36
  • @Kelly, No, it's `(1)+(1+2)+(1+2+4)+…` but as `1+2+…+2**n` ~ `2**(n+1)` then you have the sum of the powers again. It's a sum of sum of power, which is a shifted sum of power (thus the 2*n). Anyway, I agree the language wasn't the best, still believe the logic is true. – mozway Feb 10 '22 at 12:38
  • The logic looks wrong to me. And I don't see where your (1)+(1+2)+(1+2+4)+... comes from. But in either case, the *increasing* powers look like they spend 1 step in the first subloop, then 2 (or 1+2) in the second subloop, etc. And that's just not what happens. The first subloop does n steps, the second does n/2, etc. – Kelly Bundy Feb 10 '22 at 12:53
  • @Kelly I don't know what to tell you, I'd draw you a schematic if we were around a beer. Now it's probably not worth spending more time on this – mozway Feb 10 '22 at 12:57
  • I mean, sure, their `jump` value does increase. Powers of 2, yes. But *why are you summing the `jump` values?* That's not how long each subloop takes. Each subloop takes `n/jump` time, not `jump` time. You really seem to be talking about a similar but different algorithm that just happens to have the same complexity. – Kelly Bundy Feb 10 '22 at 13:11
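
To make the two quantities discussed in these comments concrete, here is a small sketch for n = 1000 that prints both the `jump` values and the number of iterations each subloop performs. The variable names `jumps` and `range_sizes` are illustrative and not from the question:

```python
n = 1000

# The jump values used by the while loop: powers of 2 below n.
jumps = []
jump = 1
while jump < n:
    jumps.append(jump)
    jump *= 2

# Number of iterations each inner loop actually performs.
range_sizes = [len(range(0, n, j)) for j in jumps]

print(jumps)             # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
print(sum(jumps))        # 1023: sum of the jump values
print(range_sizes)       # [1000, 500, 250, 125, 63, 32, 16, 8, 4, 2]
print(sum(range_sizes))  # 2000: actual iteration count
```

Both sums stay below roughly 2n, which is why either line of reasoning lands on O(n), but only the second list describes the work done in each subloop.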