You should accept the answer by @BrenBarn as it covers the important stuff. But here are a few more tips:
- There is an easier and faster way to make the initial
x
list.
Let Python do it for you. Just wrap the range()
in list()
like so:
MAX_NUM = 100
x = list(range(2, MAX_NUM))
We will have more opportunities to use MAX_NUM
later; read on.
- In Python, it is best practice to use a
for
loop with range()
instead of a while
loop adding to an index variable.
Instead of:
i = 2
while k*i <= max(x):
# do stuff with k*i
i += 1
try this:
for i in range(k*2, max(x), k):
# do stuff with i
The Python built-in range()
will make a series of values for you, starting with k*2
, adding k
each time, and stopping with the last multiple of k
that is less than max(x)
. Now your loop runs faster and you avoid a bunch of multiplies.
- Instead of indexing
x[0]
to get k
and then using x.pop()
to discard the k
value from x
, there is a simpler way.
list.pop()
returns the popped value. So you can do it in one line like so:
k = x.pop()
- You are computing
max(x)
many times. But you already know the largest number in x
, because you built x
. At the time you build x
, you can save the highest number in a variable, and use this variable instead of finding max(x)
over and over. The good part about max(x)
is that it won't check numbers that have been pulled out; for example, when k
is 3, 99 will be removed. But max(x)
is expensive so I think it is overall a win to just use a saved value.
This is why I saved MAX_NUM
. So you can do this:
for i in range(k*2, MAX_NUM, k):
# do stuff with i
- If you are just starting out, you may not know about Python's
set()
class, but it is good for this problem. As dstromberg said in his/her answer, x.remove(some_value)
is slow when x
contains many values. But removing values from a set
is always a very fast operation.
You can build a set like this:
x = set(range(2, 100))
This set will contain all integer values from 2 to 99 inclusive.
Then, one of the neat things about sets: you can get rid of members without checking to see if they are in the set or not, with the .discard()
member function (another cheap thing to do with a set
).
# list solution
if i in my_list:
my_list.remove(i)
# set solution
my_set.discard(i)
Actually, both in
(used on a list) and list.remove()
are expensive. So a set
replaces two expensive operations with a single cheap one!
Once you get your original program into a working state, keep a copy of it, and then rewrite it to use a set
instead of a list
. Increase the biggest integer from 100 to, let's say, 10000 and time both programs. You should notice the difference. A set
takes longer to make than a list
but then you win big-time on the operations (in
tests or removing a value).
- But you may be wondering, "steveha, a
set
cannot be indexed so x[0]
won't work... how do I find k
?"
I suggest simply using a for
loop with a range()
statement that makes the k
values you need. Instead of looking at x[0]
or using k = x.pop()
, you can do this:
for k in range(2, MAX_NUM):
if k not in x:
continue
With a set
the in
test is very fast, so this will quickly skip all non-primes. It's not as clever as the way your original program just always has the next prime ready to go, but overall I think a set
is a win for this problem.
And oh hey, we were able to use MAX_NUM
another time.
- You may also be thinking "A
set
doesn't have .pop()
so how do I get my list of prime numbers when the sieve is finished?"
Couldn't be easier!
result = list(my_set) # get a list of values stored in my_set
So sieve away the non-prime numbers and then just take the list of primes when you are done.
You might want to use slightly better variable names. Personally I like terse one-letter variables if they are used in compact code that stays together, so I would keep i
, but maybe x
should be sieve
and k
should be next_prime
or something.
I'm happy to see that you understand how to use the more advanced features of the print()
function, but I think your print code could be simpler.
Instead of this:
print('primes','\n',primes,sep='')
Try this:
print('primes')
print(primes)
Or maybe this:
print('primes:\n{}'.format(primes))
The actual program that uses all of the above advice to compute the Sieve of Eratosthenes is much shorter than the advice was! I have it written and tested but I won't post it unless you would like me to do so. My solution (not counting the print()
calls or blank lines) is 8 lines of Python, and the first line is: MAX_NUM = 100
(EDIT: It was six lines, but I didn't have the check to see if k
is in the set or not, so it was slow. The check added two more lines.)
When you are done, compare your original with the revised one. Which one do you prefer? Does one seem easier to understand than the other?
One of the things I love about Python is that when a program uses the built-in features of Python effectively, it becomes a simpler and more beautiful program that is easier to understand.
Good luck and have fun!