You've already got a proper answer to your question. For completeness, the other option is just to add the unneeded clause that we know will never be called:
primes = next [2 ..]
  where
    next (p : xs) =
      p : next [x | x <- xs, mod x p > 0]
    next _ = undefined
Another, more "old-style" solution, is to analyze the argument by explicit calls to head and tail (very much not recommended, in general):
primes = next [2 ..]
  where
    next xs = let { p = head xs } in
                p : next [x | x <- tail xs, mod x p > 0]
This could perhaps count as a simplification.
On an unrelated note, you write that it "works well". Unfortunately, while it does produce the correct results, it does so very slowly. Because it only ever takes one element at a time off the input list, its time complexity is quadratic in the number n of primes produced. In other words, primes !! n takes time quadratic in n. Empirically,
> primes !! 1000
7927 -- (0.27 secs, 102987392 bytes)
> primes !! 2000
17393 -- (1.00 secs, 413106616 bytes)
> primes !! 3000
27457 -- (2.23 secs, 952005872 bytes)
> logBase (2/1) (1.00 / 0.27)
1.8889686876112561 -- n^1.9
> logBase (3/2) (2.23 / 1.00)
1.9779792870810489 -- n^2.0
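The logBase calls above compute the empirical order of growth: assuming the run time behaves like t ~ c * n^a, two measurements (n1, t1) and (n2, t2) give a = log (t2/t1) / log (n2/n1). As a small sketch, this can be wrapped in a helper; the names orderOfGrowth and orders are made up for this illustration and are not library functions:

```haskell
-- Hypothetical helper names, introduced only for this illustration.
-- Assuming t ~ c * n^a, estimate a from two (n, t) measurements.
orderOfGrowth :: (Double, Double) -> (Double, Double) -> Double
orderOfGrowth (n1, t1) (n2, t2) = logBase (n2 / n1) (t2 / t1)

-- Estimated exponents for each consecutive pair of measurements.
orders :: [(Double, Double)] -> [Double]
orders ms = zipWith orderOfGrowth ms (drop 1 ms)
```

With the timings measured above, orders [(1000, 0.27), (2000, 1.00), (3000, 2.23)] comes out at roughly 1.89 and 1.98, matching the two manual calculations.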
In fact, a whole run of elements can be taken from the input at once, up to the square of the current prime, so that the code takes only about n^1.5 time, give or take a log factor:
{-# LANGUAGE ViewPatterns #-}

primes_ = 2 : next primes_ [3 ..]
  where
    next (p : ps) (span (< p*p) -> (h, xs)) =
      h ++ next ps [x | x <- xs, mod x p > 0]
    next _ _ = undefined
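If you would rather avoid the ViewPatterns extension, the same idea can be written with an ordinary let binding. This is only a sketch of the same algorithm; the name primes' and the [Int] type signature are my additions for the illustration:

```haskell
-- Same algorithm without ViewPatterns; primes' is a name chosen here.
primes' :: [Int]
primes' = 2 : next primes' [3 ..]
  where
    next (p : ps) xs =
      -- h: the candidates below p*p, which have survived filtering by
      -- all the smaller primes and so are prime themselves
      let (h, t) = span (< p * p) xs
      in  h ++ next ps [x | x <- t, mod x p > 0]
    next _ _ = undefined  -- never reached: the list of primes is infinite
```

The list feeds on itself: each prime p is pulled from the primes already produced, and filtering by p only starts at p*p.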
Empirically, again, we get
> primes_ !! 3000
27457 -- (0.08 secs, 29980864 bytes)
> primes_ !! 30000
350381 -- (1.81 secs, 668764336 bytes)
> primes_ !! 60000
746777 -- (4.74 secs, 1785785848 bytes)
> primes_ !! 100000
1299721 -- (9.87 secs, 3633306112 bytes)
> logBase (6/3) (4.74 / 1.81)
1.388897361815054 -- n^1.4
> logBase (10/6) (9.87 / 4.74)
1.4358377567888103 -- n^1.45
As we can see here, the complexity advantage expresses itself as an enormous speedup in absolute terms as well.
So this sieve is equivalent to the optimal trial division, unlike the one in the question. Of course, when it was first proposed in 1976, Haskell had no view patterns yet; in fact, Haskell itself did not yet exist.
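For comparison, here is a minimal sketch of what "optimal trial division" usually means: each candidate is tested only by the primes not exceeding its square root. The names primesTD and isPrime, and the [Int] type, are chosen for this illustration:

```haskell
-- Optimal trial division; primesTD and isPrime are names chosen here.
primesTD :: [Int]
primesTD = 2 : filter isPrime [3 ..]
  where
    -- n is prime if no prime p with p*p <= n divides it
    isPrime n = all (\p -> mod n p > 0)
                    (takeWhile (\p -> p * p <= n) primesTD)
```

Both this and the sieve that stops at p*p test each number only against the primes up to its square root, which is where the lower exponent comes from.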