
I've recently started learning Python. My apologies if this is really obvious.

I am following along with the 2008 MIT open course on Computer Science and am working on the problem of calculating the 1000th prime integer. Python 2.7.3, Win7 lappy (cough, cough...)

Here's the code I came up with:

num = 3
primeList = [2]

while len(primeList) < 1000:
    for i in primeList:
        if num % i == 0:
            break
    else:
        primeList.append(num)
    num += 1

print "The 1,000th PRIME integer is", primeList[999]

One of the assignment conditions was to only check odd numbers. Given the starting num is three, I figured it would be easy enough to simply change num+=1 to num+=2. Of note: I won't bore you with the detailed code I composed, but while writing this I was using a very verbose mode of printing out the results of each check, whether or not it was prime, which number was being checked, which integer divided into it if it wasn't prime & such (again, sorry - newB!)

At this point I became curious whether this actually takes less time to compute. If only half the numbers are being checked for primality, it should take half the time, no?

I imported the time module to check how long this was taking. Computing to the 1000th was pretty quick either way, so I increased the number of primes I was searching for to the 10,000th and didn't see any significant difference between num += 1 and num += 2.

import time
start = time.time()

num = 3
primeList = [2]

while len(primeList) < 10000:
    for i in primeList:
        if num % i == 0:
            break
    else:
        primeList.append(num)
    num += 2

print "The 10,000th PRIME integer is", primeList[9999]
end = time.time()
print "That took %.3f seconds" % (end-start)

Sometimes the num += 2 version even took a couple of milliseconds longer. I thought this was odd and was wondering if someone could help me understand why - or, more to the point: how?

Furthermore, I next imported the sqrt() function thinking this would reduce the number of integers being checked before confirming primality, but this doubled the runtime =^O.

import time
start = time.time()

from math import sqrt

num = 3
primeList = [2]

while len(primeList) < 100000:
    for i in primeList:
        if i <= sqrt(num):
            if num % i == 0:
                break
    else:
        primeList.append(num)
    num += 2

print "The 100,000th PRIME integer is",primeList[99999]
end = time.time()
print 'that took', end - start, "seconds, or", (end-start)/60, "minutes"

Certainly - it might be the way I've written my code! If not, I'm curious what exactly I am invoking here that is taking so long?

Thank you!

MmmHmm
  • First of all, use the `timeit` module. Second of all, if you are going to use your current method using `time.time()`, put the `start` and `end` calls as close to the start and end as possible; for instance, don't get the start time _before_ you import `sqrt` (it takes some non-zero time), do it _after_. Third, you're calling `sqrt(n)` every iteration, whereas you could compute it once and save the result. – Cyphase Aug 14 '15 at 02:21
  • Eratosthenes Sieve would be a heck of a lot faster if you don't mind the storage trade-off, although it's a little tricky to decide how large to make it when you want the 1000th prime. Wouldn't take too much extra smarts to make a dynamically-sized sieve though. – paddy Aug 14 '15 at 02:24
  • Do you mean `sqrt(num)`? – BrenBarn Aug 14 '15 at 02:24
  • For even numbers the first divide (by 2) eliminates the number, so each even number costs only one extra divide, which accounts for the small time difference. Also, the length of the prime list has to be calculated each time. So in the next to last pass, 99,999 numbers have to be counted, 99,998 in the pass before, etc. Instead, try adding one to a counter each time a prime is found and use `while counter < 100000`, which is one simple compare instead of counting a list of up to 100,000 items. –  Aug 14 '15 at 02:32
  • Thank you Cyphase, I'll look into **timeit**, making note of time.time() usage and - yes, will review my syntax and see if I can streamline the sqrt(num) check! – MmmHmm Aug 14 '15 at 02:39
  • paddy - yeah, I read about Eratosthenes Sieve and quickly realized I didn't have the exxxtra smarts for a dynamic sieve... One day! – MmmHmm Aug 14 '15 at 02:40
  • BrenBarn - thank you, yes I meant sqrt(num)! – MmmHmm Aug 14 '15 at 02:40
  • Curly joe, thank you for the suggestion! I had generated a list initially so I could retrace my steps and verify what I was generating, but yes - time to let go of that security blanket :-D – MmmHmm Aug 14 '15 at 02:42
  • @CurlyJoe: This isn't Lisp; finding the length of a list is O(1), since lists keep track of their length. – user2357112 Aug 14 '15 at 02:43
  • @CurlyJoe, `len(a_list)` is O(1) (constant time), so it's not nearly as bad as you indicated. It would help a tiiny bit though. – Cyphase Aug 14 '15 at 02:45
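
The points raised in the comments can be checked with a short sketch. It is an illustration under assumptions, not the asker's code: the function name `nth_prime` and the `checks` counter are made up for this example, and `print` is called with a single string so the snippet should run on both Python 2.7 and Python 3. It counts modulo operations to show that every even candidate costs exactly one check, and it uses `timeit.timeit` as suggested instead of `time.time()`.

```python
import timeit


def nth_prime(n, step):
    """Trial division against every known prime, as in the question.

    Returns (n-th prime, number of modulo operations performed).
    `nth_prime` and `checks` are hypothetical names for this sketch.
    """
    primes = [2]
    num = 3
    checks = 0
    while len(primes) < n:  # len() of a list is O(1)
        for p in primes:
            checks += 1
            if num % p == 0:
                break  # composite: stop at the first divisor
        else:
            primes.append(num)  # no divisor found: prime
        num += step
    return primes[-1], checks


prime_all, checks_all = nth_prime(1000, 1)  # test every integer
prime_odd, checks_odd = nth_prime(1000, 2)  # test odd integers only

# Even candidates run from 4 up to (n-th prime - 1).
evens_tested = (prime_all - 3) // 2
print("extra checks with step 1: %d" % (checks_all - checks_odd))
print("even numbers tested:      %d" % evens_tested)

# timeit runs the call several times and returns the total elapsed seconds.
for step in (1, 2):
    seconds = timeit.timeit(lambda: nth_prime(1000, step), number=5)
    print("step=%d: %.3f s for 5 runs" % (step, seconds))
```

The two counts printed first should match: skipping even numbers saves one modulo per even candidate, which is tiny next to the work spent dividing each prime candidate by every prime already in the list.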

1 Answer

Two things.

First, you're calculating sqrt(num) on every iteration of the inner loop. This adds work, because it's something else your code now has to do on every pass through the loop, even though num doesn't change until the inner loop has finished.

Second, the way you're using sqrt doesn't reduce the number of primes the loop visits, because you don't exit the loop even when i is bigger than sqrt(num). The modulo is skipped, but the loop keeps running as a do-nothing loop over all the higher primes in the list.

Try this instead:

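from math import sqrt

num = 3
primeList = [2]

while len(primeList) < 100000:
    rootN = sqrt(num)  # compute the square root once per candidate
    for i in primeList:
        if i <= rootN:
            if num % i == 0:
                break  # found a divisor: num is composite
        else:
            # i > sqrt(num) and no divisor found: num is prime
            primeList.append(num)
            break
    else:
        # reached only if the list runs out before any prime exceeds rootN
        primeList.append(num)
    num += 2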

This finds 100,000 primes in about 3 seconds on my machine.
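
To see where the time goes, here is a rough sketch that counts inner-loop iterations for both approaches. The function names `primes_without_exit` and `primes_with_exit` are hypothetical, the count is reduced to 1,000 primes so it runs quickly, and `sqrt` is hoisted out of the inner loop in both functions so that the early exit is the only difference between them. The second function uses a flag rather than the nested `for`/`else` above; that is a stylistic swap, not a different algorithm.

```python
from math import sqrt


def primes_without_exit(count):
    """The question's loop: the sqrt test skips the modulo, not the iteration."""
    primes, num, iterations = [2], 3, 0
    while len(primes) < count:
        root = sqrt(num)
        for p in primes:
            iterations += 1
            if p <= root and num % p == 0:
                break  # composite
        else:
            primes.append(num)  # the loop walked the whole list
        num += 2
    return primes, iterations


def primes_with_exit(count):
    """Same test, but stop as soon as a prime exceeds sqrt(num)."""
    primes, num, iterations = [2], 3, 0
    while len(primes) < count:
        root = sqrt(num)
        is_prime = True
        for p in primes:
            iterations += 1
            if p > root:
                break  # no divisor up to sqrt(num): prime
            if num % p == 0:
                is_prime = False
                break  # composite
        if is_prime:
            primes.append(num)
        num += 2
    return primes, iterations


slow_primes, slow_iterations = primes_without_exit(1000)
fast_primes, fast_iterations = primes_with_exit(1000)
print("same primes: %s" % (slow_primes == fast_primes))
print("iterations without early exit: %d" % slow_iterations)
print("iterations with early exit:    %d" % fast_iterations)
```

Both functions produce the same list. Without the early exit, every prime candidate still walks the entire list of known primes, so the sqrt comparison only adds cost; with the exit, the walk stops at the first prime above the square root.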

BrenBarn