Haskell Sieve of Eratosthenes with list of composites

Question

I have to implement the classic problem of the Sieve of Eratosthenes in Haskell for a project. Rather than computing each prime I only have to compare numbers between lists. For instance I pass a list of potential primes (parameter 1) and a list of composites (list 2). sieve [2..10] [] results with the list [2,3,5,7].

I think I am very close and it compiles, but it appends every item to the prime list rather than throwing out the composites. My thinking was that it would take the list x of all numbers 2..10 or whatever and a list y of composites use elem to see if the head of list x is found in list y and if so append to list z and print. Thanks in advance!

Currently my code returns everything in the first list and refuses to sort. sieve [2..10] [] results in [2,3,4,5,6,7,8,9,10]

sieve ::[Int]->[Int]->[Int]
z = []
sieve [] [] = []
sieve x [] = x
sieve [] y = y
sieve xs ys = if ((elem (head xs)) ys) then (sieve (tail xs) ys)
    else ((head xs):z)

Your invocation of `sieve [2..10] []` pattern matches with the definition `sieve x [] = x`, so none of the other clauses for `sieve` are being used. — ErikR, May 14 '15 at 03:16
The program you show doesn't make much sense, first of all `sieve x []` will always be used, furthermore in you should check if an element is divisable by the other list. Finally you should make the call recursive, something you don't do with `head xs : z` since `z` is defined as the empty list. — Willem Van Onsem, May 14 '15 at 04:17
no divisibility checks are to be performed in the sieve of Eratosthenes!! — Will Ness, May 14 '15 at 10:01
@WillNess: then (a) why does [this implementation](http://en.literateprograms.org/Sieve_of_Eratosthenes_%28Haskell%29#A_first_look) uses divisibility checks, (b) if you add elements to `ys` for most applications, the elements will be smaller, so the sieve would allow all elements; and (3) this wiki article clearly states [divisibility](http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes#Trial_division). You can indeed implement a version where composites are calculates for `ys`, but that will reduce the performance of the algorithm. — Willem Van Onsem, May 14 '15 at 14:57
@CommuSoft (a) is not a sieve of Eratosthenes, it is a sieve of trial division. (b) didn't understand you, sorry. (c) yes, it says "It is often confused with the sieve of Eratosthenes" i.e. it is *not* the sieve of Eratosthenes, it is *confused* for it. If you haven's seen [this page](https://wiki.haskell.org/Prime_numbers) yet, maybe it will be an interesting read. :) — Will Ness, May 14 '15 at 15:39

Willem Van Onsem · Answer 1 · 2015-05-14T16:00:47.820

The program you show doesn't make much sense, first of all sieve x [] will always be used, furthermore in you should check if an element is divisable by the other list. Finally you should make the call recursive, something you don't do with head xs : z since z is defined as the empty list.

Let's start with the base case: in case the left list is empty, regardless of the content of the second list, one returns the empty list. Since sieving nothing will result in nothing:

sieve [] _ = []

Next we look for the inductive case, with as pattern:

sieve (x:xs) ds = ...

Now we need to enumerate over the list of already found elements. From the moment any of the found elements can divide x, we know the number is not (relative) prime. This condition is formalized as:

(==) 0 . mod x :: Integral b => b -> Bool

Or to iterate over the list of ds:

any ((==) 0 . mod x) ds

In case such element exists, we simply skip the element, and call the inductive case with sieve xs ds.

In case there is no such element, we add it to the list of ds and emit it. The result is thus: x : sieve xs (x:ds). The inductive case is thus:

sieve (x:xs) ds | any ((==) 0 . mod x) ds = sieve xs ds
                | otherwise = x : sieve xs (x:ds)

We can shorten this a bit by creating a specific variable for sieve xs:

sieve (x:xs) ds | any ((==) 0 . mod x) ds = rec ds
                | otherwise = x : rec (x:ds)
                where rec = sieve xs

The full function is thus:

sieve [] _ = []
sieve (x:xs) ds | any ((==) 0 . mod x) ds = rec ds
                | otherwise = x : rec (x:ds)
                where rec = sieve xs

You can boost the performance in two ways:

Adding x at the end of ds. This is indeed a more expensive operation. But after a while you don't add numbers that often. It is interesting because in that case ys looks like [2,3,5,7,11] instead of [11,7,5,3,2]. Now the chances that a number is divisible by 2 (50%) is greater than a number being divisible by 11 (9.9%). It is better to try to try the test that will succeed most probable first.
Furthermore you can end the check after the dividers you have reached the square root of the number to be tested: if a number is not divisible by a number smaller than such number, it is definitely not divisible by a number greater than the square root.

A more efficient approach is thus:

sieve [] _ = []
sieve (x:xs) ds | any ((==) 0 . mod x) $ takeWhile (\y -> y*y <= x) ds = rec ds
                | otherwise = x : rec (ds++[x])
                where rec = sieve xs

very good explanation - but I think the result will be in the wrong order (which might or might not be a problem for the OP) maybe you can fix that - also this is of course not the *sieve* but I think you know this already — Random Dev, May 14 '15 at 05:50
@CarstenKönig: you mean the order of `ys`? Indeed by appending `x` at the end of `ys` (case 2), you can boost performance a bit since it is more likely a number to be diviseable by `2` than by `3`. But since the order of `ys` does not has an impact on *the result* of the check. If one calls `sieve [2..] []` the prime numbers are ordered. — Willem Van Onsem, May 14 '15 at 15:05

Will Ness · Accepted Answer · 2015-08-21T18:53:26.983

what you call sieve is usually called minus, subtracting the second list from the first, assuming the both are ordered, increasing lists of numbers. then it is enough to compare just the two head elements, without any elem calls.

but it could still work, had you provided a proper definition for z. z=[] is just a placeholder, to make it compile (right?); it's not the right definition. it should have been:

sieve :: [Int] -> [Int] -> [Int]
-- z = []
sieve [] [] = []
sieve x [] = x
sieve [] y = y
sieve xs ys = if ((elem (head xs)) ys) then (sieve (tail xs) z)
    else ((head xs) : sieve (tail xs) ys )
      where
         z = ... -- need to remove (head xs) from ys

For the last comment's task, you could use e.g. delete function.

This still won't produce you a list of primes without the list of composites, so the initial call can not be with the second list empty (or else, you'd get the same first argument back, as you do, because of the sieve x [] = x equation):

primesAmong input = sieve input composites

But what are composites? Eratosthenes's answer is, why, they are multiples of primes! (and trial division says: composites have other primes as their divisors).

Given a prime, say 2, we just count: 2,4,6,...; and for 3, say, it's 3,6,9,12,...; to find its multiples. Let's write it down:

composites = mutliplesOf primes
mutliplesOf primes = [ mult | p <- primes, mult <- [...] ]

This doesn't quite fit: this multiplesOf needs an argument:

primes = primesAmong input
primesAmong input = sieve input (mutliplesOf primes)

We seem to be chasing our own tail here; we don't have primes yet; what can we use instead? Is there a harm in finding multiples of non-primes, as well as primes?

After you do have a running code, try to find a way to use primes after all.

Didn't end up needing the delete function but found a lot of problems with my base case and a few other things. only needed sieve [] y = []. thanks! — trainwizard, May 15 '15 at 02:57

Haskell Sieve of Eratosthenes with list of composites

2 Answers2