2

First of all, this is HOMEWORK, I am currently working on a Sieve of Eratosthenes on python. My program looks like:

x=[]
    for i in range(2,100):
        x.append(i)
primes=[]
i=2

while len(x)!=0:
    k=x[0]
    x.pop(0)
    primes.append(k)
    while k*i in x:
        x.remove(k*i)
        i+=1

print(primes)
print(x)

When my program prints for 'primes', I get:

[2, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39,
41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77,
79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99]

Why are there composite numbers in the list? It looks like the program should work

#

Edited the program, now it looks:

x=[]
for i in range(2,100):
    x.append(i)
primes=[]



while len(x)!=0:
    i=2
    k=x[0]
    x.pop(0)
    primes.append(k)
    while k*i<=max(x):
        x.remove(k*i)
        i+=1
        if k*i not in x:
            i+=1

print('primes','\n',primes,sep='')
print('x','\n',x,sep='')

Still not working, getting an error; "ValueError: list.remove(x): x not in list"

Eed
  • 385
  • 1
  • 2
  • 7

3 Answers3

2

One problem is you only ever add to i. You need to reset i back to 2 every time you pop another element from the list.

That is, at first you pop 2 from the list. You gradually increase i to remove all multiple of 2 from the list. At the end of this, i will be 50. But then you go back and pop 3 from the list, and i is still 50, so it only looks for 50*3 in the list.

Even if you fix this, though, it still won't work, because you stop looking at i values as soon as you find one that isn't in the list. But it's possible that k*i isn't in the list but k*(i+1) is -- for instance, after you seive multiples of 2, the first multiple of 3 (namely 6) isn't in the list, but the next (namely 9) is. So you can't stop until you try every multiple up to the list max.

BrenBarn
  • 242,874
  • 37
  • 412
  • 384
  • Thank YOU, I've made slight changes to the program which I think took care of the problems you mentioned, but still doesn't seem to work – Eed Nov 13 '13 at 06:02
  • @Eed: You still need to check if `k*i` is in the list before you try to remove it. – BrenBarn Nov 13 '13 at 06:06
1

Comments:

  • Should use more descriptive variable names
  • Need to restart i at 2 each time entering the k * i in x loop
  • x.pop(0) is slow for large x
  • x.remove(k * i) is slow for large x
  • Should change inner while loop to "while k * i < top" and add an "if k * i in x". top is 100.

Here's a working and fast sieve that uses a bit list: http://stromberg.dnsalias.org/svn/sieve/trunk/

dstromberg
  • 6,954
  • 1
  • 26
  • 27
1

You should accept the answer by @BrenBarn as it covers the important stuff. But here are a few more tips:

  • There is an easier and faster way to make the initial x list.

Let Python do it for you. Just wrap the range() in list() like so:

MAX_NUM = 100
x = list(range(2, MAX_NUM))

We will have more opportunities to use MAX_NUM later; read on.

  • In Python, it is best practice to use a for loop with range() instead of a while loop adding to an index variable.

Instead of:

i = 2
while k*i <= max(x):
    # do stuff with k*i
    i += 1

try this:

for i in range(k*2, max(x), k):
    # do stuff with i

The Python built-in range() will make a series of values for you, starting with k*2, adding k each time, and stopping with the last multiple of k that is less than max(x). Now your loop runs faster and you avoid a bunch of multiplies.

  • Instead of indexing x[0] to get k and then using x.pop() to discard the k value from x, there is a simpler way.

list.pop() returns the popped value. So you can do it in one line like so:

k = x.pop()
  • You are computing max(x) many times. But you already know the largest number in x, because you built x. At the time you build x, you can save the highest number in a variable, and use this variable instead of finding max(x) over and over. The good part about max(x) is that it won't check numbers that have been pulled out; for example, when k is 3, 99 will be removed. But max(x) is expensive so I think it is overall a win to just use a saved value.

This is why I saved MAX_NUM. So you can do this:

for i in range(k*2, MAX_NUM, k):
    # do stuff with i
  • If you are just starting out, you may not know about Python's set() class, but it is good for this problem. As dstromberg said in his/her answer, x.remove(some_value) is slow when x contains many values. But removing values from a set is always a very fast operation.

You can build a set like this:

x = set(range(2, 100))

This set will contain all integer values from 2 to 99 inclusive.

Then, one of the neat things about sets: you can get rid of members without checking to see if they are in the set or not, with the .discard() member function (another cheap thing to do with a set).

# list solution
if i in my_list:
    my_list.remove(i)

# set solution
my_set.discard(i)

Actually, both in (used on a list) and list.remove() are expensive. So a set replaces two expensive operations with a single cheap one!

Once you get your original program into a working state, keep a copy of it, and then rewrite it to use a set instead of a list. Increase the biggest integer from 100 to, let's say, 10000 and time both programs. You should notice the difference. A set takes longer to make than a list but then you win big-time on the operations (in tests or removing a value).

  • But you may be wondering, "steveha, a set cannot be indexed so x[0] won't work... how do I find k?"

I suggest simply using a for loop with a range() statement that makes the k values you need. Instead of looking at x[0] or using k = x.pop(), you can do this:

for k in range(2, MAX_NUM):
    if k not in x:
        continue

With a set the in test is very fast, so this will quickly skip all non-primes. It's not as clever as the way your original program just always has the next prime ready to go, but overall I think a set is a win for this problem.

And oh hey, we were able to use MAX_NUM another time.

  • You may also be thinking "A set doesn't have .pop() so how do I get my list of prime numbers when the sieve is finished?"

Couldn't be easier!

result = list(my_set)  # get a list of values stored in my_set

So sieve away the non-prime numbers and then just take the list of primes when you are done.

  • You might want to use slightly better variable names. Personally I like terse one-letter variables if they are used in compact code that stays together, so I would keep i , but maybe x should be sieve and k should be next_prime or something.

  • I'm happy to see that you understand how to use the more advanced features of the print() function, but I think your print code could be simpler.

Instead of this:

print('primes','\n',primes,sep='')

Try this:

print('primes')
print(primes)

Or maybe this:

print('primes:\n{}'.format(primes))

The actual program that uses all of the above advice to compute the Sieve of Eratosthenes is much shorter than the advice was! I have it written and tested but I won't post it unless you would like me to do so. My solution (not counting the print() calls or blank lines) is 8 lines of Python, and the first line is: MAX_NUM = 100 (EDIT: It was six lines, but I didn't have the check to see if k is in the set or not, so it was slow. The check added two more lines.)

When you are done, compare your original with the revised one. Which one do you prefer? Does one seem easier to understand than the other?

One of the things I love about Python is that when a program uses the built-in features of Python effectively, it becomes a simpler and more beautiful program that is easier to understand.

Good luck and have fun!

steveha
  • 74,789
  • 21
  • 92
  • 117