Can this prime sieve code be further simplified in Haskell?

Question

The code works well

primes = next [2 ..]
  where
    next (p : ps) = p : next ts
      where
        ts = filter (\x -> mod x p /= 0) ps

Just GHCI think there is a incomplete patter in next.

Well, this is correct from a grammatical point of view.

But obviously the input of 'next' cannot be empty.

So is there a solution other than adding the declaration ({-# OPTIONS_GHC -Wno-incomplete-patterns #-})?

I make a mistake. the output miss the first `2`. I've been fixed it by changing `head ts` to `p` — Sid, Dec 31 '21 at 16:39

score 2 · Accepted Answer · edited Jan 02 '22 at 05:18

2

The exhaustiveness checker knows that next has type Num a => [a] -> [a]. The empty list is a valid argument to next, even if you never actually call next on the empty list.

The key here is that you don't really want Num a => [a] as your argument type. You know it will only be called on an infinite list, so use a type that doesn't have finite lists as values.

data Stream a = Cons a (Stream a)

sequence :: Num a => a -> Stream a
sequence x = Cons x (sequence (x + 1))

filterStream :: (a -> Bool) -> Stream a -> Stream a
filterStream p (Cons x xs) | p x = Cons x (filterStream p xs)
                           | otherwise = filterStream p xs

-- Since you'll probably want a list of values, not just a stream of them, at some point.
toList :: Stream a -> [a]
toList (Cons x xs) = x : toList xs

primes :: Stream Integer
primes = next (sequence 2)
  where 
    next (Cons x xs) = Cons x xs'
      where xs' = filterStream (\x -> mod x p /= 0) xs

The Stream library provides a module Data.Stream that defines the Stream type and numerous analogs to list functions.

import qualified Data.Stream as S

-- S.toList exists as well.

primes :: Stream Integer
primes = next (S.fromList [2..])
  where next (Cons x xs) = Cons x (S.filter (\x -> mod x p /= 0) xs)

edited Jan 02 '22 at 05:18

Will Ness

70,110
9
98
181

answered Dec 31 '21 at 17:38

chepner

497,756
71
530
681

1

Calling `S.fromList` is just hiding the incomplete pattern match in a library call. `S.iterate (1+) 2` seems like the right away to guarantee totality. – Daniel Wagner Dec 31 '21 at 19:47
a link to`Data.Stream` please. – Will Ness Jan 01 '22 at 04:46
Hm, indeed I can't find it either. It's a well-known educational device, but maybe so simple that nobody uses the simple version in practice? – amalloy Jan 02 '22 at 05:12
@amalloy I've found it now! on [Hoogle](https://hoogle.haskell.org/?=&hoogle=Data.Stream%20is%3Amodule&scope=), I clicked the "is:module" toggle this time and it showed up in the middle of the results list. – Will Ness Jan 02 '22 at 05:14
(couldn't find it before, when I first commented) – Will Ness Jan 02 '22 at 05:24

Will Ness · Answer 2 · 2022-04-09T17:20:15.550

You've already got a proper answer to your question. For completeness, the other option is just to add the unneeded clause that we know will never be called:

primes = next [2 ..]
  where
  next (p : xs) =
       p : next [x | x <- xs, mod x p > 0]
  next _ = undefined

Another, more "old-style" solution, is to analyze the argument by explicit calls to head and tail (very much not recommended, in general):

primes = next [2 ..]
  where
  next xs = let { p = head xs } in
       p : next [x | x <- tail xs, mod x p > 0]

This could perhaps count as a simplification.

On an unrelated note, you write that it "works well". Unfortunately, while indeed producing the correct results, it does so very slowly. Because of always taking only one element at a time off the input list, its time complexity is quadratic in the number n of primes produced. In other words, primes !! n takes time quadratic in n. Empirically,

> primes !! 1000
7927     -- (0.27 secs, 102987392 bytes)
> primes !! 2000
17393    -- (1.00 secs, 413106616 bytes)
> primes !! 3000
27457    -- (2.23 secs, 952005872 bytes)

> logBase (2/1) (1.00 / 0.27)
1.8889686876112561                -- n^1.9
> logBase (3/2) (2.23 / 1.00)
1.9779792870810489                -- n^2.0

In fact the whole bunch of the elements may be taken from the input at once, up to the square of the current prime, with the code thus taking only about ~ n^1.5 time, give or take a log factor:

{-# LANGUAGE ViewPatterns #-}

primes_ = 2 : next primes_ [3 ..] 
  where 
  next (p : ps) (span (< p*p) -> (h, xs)) = 
      h ++ next ps [x | x <- xs, mod x p > 0]
  next _ _ = undefined

Empirically, again, we get

> primes !! 3000
27457     -- (0.08 secs,   29980864 bytes)
> primes !! 30000
350381    -- (1.81 secs,  668764336 bytes)
> primes !! 60000
746777    -- (4.74 secs, 1785785848 bytes)
> primes !! 100000
1299721   -- (9.87 secs, 3633306112 bytes)

> logBase (6/3)  (4.74 / 1.81)
1.388897361815054                 -- n^1.4
> logBase (10/6) (9.87 / 4.74)
1.4358377567888103                -- n^1.45

As we can see here, the complexity advantage expresses itself as an enormous speedup in absolute terms as well.

So then this sieve is equivalent to the optimal trial division, unlike the one in the question. Of course when it was first proposed in 1976, Haskell had no view patterns yet, and in fact there was yet no Haskell itself.

Can this prime sieve code be further simplified in Haskell?

2 Answers2