49

Does the Haskell standard library have a function that, given a list and a predicate, returns the number of elements satisfying that predicate? Something with type `(a -> Bool) -> [a] -> Int`. My Hoogle search didn't return anything interesting. Currently I am using `length . filter pred`, which I don't find to be a particularly elegant solution. My use case seems common enough to have a better library solution than that. Is that the case, or is my hunch wrong?

missingfaktor

6 Answers

47

The `length . filter p` implementation isn't nearly as bad as you suggest. In particular, it has only constant overhead in memory and speed.

For things that use stream fusion, like the `vector` package, `length . filter p` will actually be optimized so as to avoid creating an intermediate vector. Lists, however, currently use what's called foldr/build fusion, which is not quite smart enough to optimize `length . filter p` without creating linearly large thunks that risk stack overflows.
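As a rough illustration of the vector case, here is a minimal sketch. It assumes the `vector` package is installed, and `countV` is a hypothetical name chosen for this example, not something the package exports. Whether the intermediate vector is actually eliminated depends on the GHC version and optimization flags.

```haskell
import qualified Data.Vector as V

-- Hypothetical helper (not part of the vector package):
-- count the elements of a vector that satisfy a predicate.
countV :: (a -> Bool) -> V.Vector a -> Int
countV p = V.length . V.filter p
```

For instance, `countV even (V.fromList [1 .. 10])` evaluates to `5`.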

For details on stream fusion, see this paper. As I understand it, the reason that stream fusion is not currently used in the main Haskell libraries is that (as described in the paper) about 5% of programs perform dramatically worse when implemented on top of stream-based libraries, while foldr/build optimizations can never (AFAIK) make performance actively worse.

Louis Wasserman
  • 4
    This is really one of the coolest things about Haskell: that you can specify rewrite rules in your own code that get used by the optimizer, and use them to get awesomely composable and efficient code. – Louis Wasserman Jan 30 '12 at 21:05
  • But the point is, you shouldn't write your own code for this: you should just use `length . filter p`, and trust in the optimizer. – Louis Wasserman Jan 30 '12 at 21:08
  • 2
    I'd add `count pred = length . filter pred` to my utils. – missingfaktor Jan 30 '12 at 21:18
  • 3
    That works too...but a word of advice: you'll get better results if you use an `{-# INLINE count #-}` pragma above your declaration of `count`. (Where "better results" refers to "the optimizer getting more opportunities to do its thing.") – Louis Wasserman Jan 30 '12 at 21:21
  • Thanks! I wasn't aware of that pragma either. I suggest you add that to your answer too. – missingfaktor Jan 30 '12 at 21:23
  • Um. Will someone else besides me confirm that this is actually correct? I'm no longer sure that length is built with foldr to trigger fusion like this. – Louis Wasserman Jan 31 '12 at 16:22
  • 3
    @LouisWasserman: Fusion happens when a "good consumer" consumes the result of a "good producer". Both are listed [here](http://www.haskell.org/ghc/docs/7.2.2/html/users_guide/rewrite-rules.html#id570014), and `length` is not included. See also: [GHC #876: length is not a good consumer](http://hackage.haskell.org/trac/ghc/ticket/876). – hammar Jan 31 '12 at 17:57
  • Right. Uh. Should I delete my answer, or what? I'm really ashamed now of getting so many upvotes for a wrong answer...although I suppose from this bug report that length was at _some_ point a good consumer? – Louis Wasserman Jan 31 '12 at 18:52
  • (@missingfaktor: you'll need to unaccept this answer before I can delete it. Bleahhhhh.) – Louis Wasserman Jan 31 '12 at 19:18
  • @LouisWasserman, I don't think you should delete this answer. Just strike the inaccurate bits. That's the convention. – missingfaktor Jan 31 '12 at 19:27
  • Updated with a good amount of detail on why this doesn't work, and some ways in which it does work (i.e. the `vector` package). – Louis Wasserman Jan 31 '12 at 19:45
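For reference, the `count` helper discussed in the comments above could be sketched like this. The name comes from the comments; whether the pragma buys anything depends on the GHC version and on which functions take part in fusion, as the later comments point out.

```haskell
-- Utility suggested in the comments: count the list elements
-- that satisfy a predicate.
{-# INLINE count #-}
count :: (a -> Bool) -> [a] -> Int
count p = length . filter p
```

With this definition, `count even [1 .. 10]` evaluates to `5`.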
7

No, there is no predefined function that does this, but I would say that `length . filter pred` is, in fact, an elegant implementation; it's as close as you can get to expressing what you mean without just invoking the concept directly, which you can't do if you're defining it.

The only alternatives would be a recursive function or a fold, which IMO would be less elegant, but if you really want to:

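import Data.List (foldl')
-- Strict left fold: bump the counter for each element that satisfies p.
foo :: (a -> Bool) -> [a] -> Int
foo p = foldl' (\n x -> if p x then n + 1 else n) 0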

This is basically just inlining `length` into the definition. As for naming, I would suggest `count` (or perhaps `countBy`, since `count` is a reasonable variable name).

ehird
  • +1, It appears then I'll have to define a new function in my utils. Thanks for confirming, and for the advice regarding the name of the function. – missingfaktor Jan 30 '12 at 21:03
  • I would advise against defining a new function in your utils. `length . filter pred` will be more legible. If you find yourself repeating it over and over, it's appropriate to define it in a relatively narrow scope (e.g., a `where` binding or an unexported top-level function). But if you put it in a utils module, third parties reading your code will have to dig that up to figure out what your code is doing. – Luis Casillas Jan 30 '12 at 21:19
6

Haskell is a high-level language. Rather than provide one function for every possible combination of circumstances you might ever encounter, it provides you with a smallish set of functions that cover all of the basics, and you then glue these together as required to solve whatever problem is currently at hand.

In terms of simplicity and conciseness, this is as elegant as it gets. So yes, `length . filter pred` is absolutely the standard solution. As another example, consider `elem`, which (as you may know) tells you whether a given item is present in a list. The standard reference implementation for this is actually

elem :: Eq x => x -> [x] -> Bool
elem x = foldr (||) False . map (x ==)

In other words, compare every element in the list to the target element, creating a new list of `Bool`s. Then fold the logical-OR function over this new list.

If this seems inefficient, try not to worry about it. In particular,

  1. The compiler can often optimise away temporary data structures created by code like this. (Remember, this is the standard way to write code in Haskell, so the compiler is tuned to deal with it.)

  2. Even if it can't be optimised away, laziness often makes such code fairly efficient anyway.

(In this specific example, the OR function will terminate the loop as soon as a match is seen - just like what would happen if you hand-coded it yourself.)
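The short-circuiting behaviour described above can be checked with a small sketch. `elemSketch` is a hypothetical name, used only to avoid clashing with the Prelude's `elem`; the body is the reference definition quoted earlier.

```haskell
-- Same shape as the reference definition of elem quoted above,
-- under a hypothetical name so it does not clash with the Prelude.
elemSketch :: Eq a => a -> [a] -> Bool
elemSketch x = foldr (||) False . map (x ==)
```

Because `(||)` never inspects its second argument once the first is `True`, `elemSketch 3 [1 ..]` returns `True` even though the list is infinite. A search for an element that is absent from an infinite list would not terminate.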

As a general rule, write code by gluing together pre-existing functions. Change this only if performance isn't good enough.

MathematicalOrchid
  • Thank you for your answer, +1. I have been using functional programming for long enough, and am well aware of the things you said in your answer, but when I have a pattern as repetitive as the one being discussed here, I factor it out in a separate function. – missingfaktor Jan 31 '12 at 18:17
  • [Here](http://goo.gl/jAOMv) is a philosophy page of Factor programming language, a language where composition is _the_ way of doing things, and most code is written in point-free manner. Have a look at it. Many of those guidelines apply regardless of the language, and I happen to follow them. You'll find similar advice in Leo Brodie's "Thinking Forth", which is a great book on software development in general. – missingfaktor Jan 31 '12 at 18:18
  • 2
    `a smallish set of functions that cover all of the basics, and you then glue these together as required to solve whatever problem is currently at hand` in my book this is the definition of low level – Arnaud Le Blanc Dec 03 '16 at 11:32
2

This is my amateurish solution to a similar problem: count the number of negative integers in a list `l`.

nOfNeg l = length (filter (< 0) l)
main = print (nOfNeg [0, -1, -2, 1, 2, 3, 4]) -- prints 2
JayJay
0

No, there isn't!

As of 2020, there is indeed no such function in the Haskell standard library yet! One could (and should), however, define a helper `howMany` (resembling good old `any`):

howMany p xs = sum [ 1 | x <- xs, p x ]
-- howMany=(length.).filter

main = print $ howMany (/=0) [0..9]

Try howMany=(length.).filter

Roman Czyborra
-1

I'd do it manually:

howmany :: (a -> Bool) -> [a] -> Int
howmany _ [] = 0
howmany p (x:xs)
  | p x       = 1 + howmany p xs
  | otherwise = howmany p xs
LLD