69

I am looking for a function that returns the first element in a sequence for which an fn evaluates to true. For example:

(first-map (fn [x] (= x 1)) '(3 4 1))

The above fake function should return 1 (the last element in the list). Is there something like this in Clojure?

maiwald
  • 891
  • 12
  • 26
Matthew
  • 11,203
  • 9
  • 38
  • 44

7 Answers7

83
user=> (defn find-first
         [f coll]
         (first (filter f coll)))
#'user/find-first
user=> (find-first #(= % 1) [3 4 1])
1

Edit: A concurrency. :) No. It does not apply f to the whole list. Only to the elements up to the first matching one due to laziness of filter.

kotarak
  • 17,099
  • 2
  • 49
  • 39
  • 4
    Thanks @kotarak, my concern is that this would process all of the elements in the collection, which is undesirable on a large list. Perhaps I need to create a function that recurs until it finds the lement that satisfies the condition. – Matthew Apr 17 '12 at 14:02
  • 6
    @Matthew modulo chunked sequences. There `f` may be applied to more elements depending on the chunk size. – kotarak Apr 17 '12 at 14:32
  • using `filter` or `some` takes more resources. alternatively, `reduced` can shortcut the whole calculation while takes less resources: `(reduce #(when (= %2 50000000) (reduced %2)) nil (range))`. – 象嘉道 Sep 04 '15 at 19:02
  • @xando Did you use something like the following to prove it?`user=> (def input (range)) #'user/input user=> (time (some #{50000000} input)) "Elapsed time: 3909.789411 msecs" 50000000 user=> (time (reduce #(when (= %2 50000000) (reduced %2)) nil input)) "Elapsed time: 198616.007581 msecs" 50000000` These happen to be the results I got. – Marko Topolnik Sep 05 '15 at 06:26
  • @MarkoTopolnik things are becoming interesting. i've run several times continuously, and the performance shows wave, but "reduce" way is faster in my tests: `red> (def input (range)) #'red/input red> (time (reduce #(when (= %2 50000000) (reduced %2)) nil input)) "Elapsed time: 9708.709406 msecs" red> (time (some #{50000000} input)) "Elapsed time: 27345.824046 msecs" red> (def input (range)) #'red/input red> (time (some #{50000000} input)) "Elapsed time: 42612.98816 msecs" red> (time (reduce #(when (= %2 50000000) (reduced %2)) nil input)) "Elapsed time: 6802.505188 msecs"` – 象嘉道 Sep 08 '15 at 18:19
  • @MarkoTopolnik I used JDK 8 and clojure 1.7. – 象嘉道 Sep 08 '15 at 18:21
  • @xando The main point here should be that this is not a reliable way to measure, and it is especially not reliable at establishing the causes of those delays. For example, if I just slightly reduce the count from 50 million to 40 million, the times are both within 5 seconds. Further, the advantage of `reduce` is that it relies on the special-cased ability of a `range` to reduce itself; `some` is not coded in terms of `reduce` (but not essentially so---Clojure just isn't optimized aggressively enough). If you use `(remove odd? (range))`, the advantage of `reduce` is gone. – Marko Topolnik Sep 08 '15 at 18:24
  • @MarkoTopolnik just share some of my findings :-) – 象嘉道 Sep 08 '15 at 18:38
  • 1
    @xando Actually you started out by stating that `some` uses more resources and that `reduced` can shortcut the whole calculation while taking less resources. This is true only in the exceptional case of using a pure `range` or other sequence/collection which can directly reduce itself. Plus, all the alternatives obviously short circuit (`range` being an infinite sequence). – Marko Topolnik Sep 08 '15 at 18:43
  • This solution looks like it would retain the head, and would take up a lot of memory? Is there a solution which takes constant space? – Didier A. Mar 13 '19 at 20:52
57

In your case, the idiom is

(some #{1} [1 2 3 4])

How it works: #{1} is a set literal. A set is also a function evaluating to its arg if the arg is present in the set and to nil otherwise. Any set element is a "truthy" value (well, except for a boolean false, but that's a rarity in a set). some returns the return value of the predicate evaluated against the first collection member for which the result was truthy.

Marko Topolnik
  • 195,646
  • 29
  • 319
  • 436
  • This is useful if you can search utilizing an element in entirety, but `some` will not return compound elements (such as hashmaps) if they are indexed partially. – ctpenrose Sep 24 '18 at 18:10
  • 1
    `(->> [nil nil nil nil 42 nil nil] (some identity))` 42 – Dustin Getz Jan 18 '19 at 21:42
25

I tried several methods mentioned in this thread (JDK 8 and Clojure 1.7), and did some benchmark tests:

repl> (defn find-first
         [f coll]
         (first (filter f coll)))
#'cenx.parker.strategies.vzw.repl/find-first

repl> (time (find-first #(= % 50000000) (range)))
"Elapsed time: 5799.41122 msecs"
50000000

repl> (time (some #{50000000} (range)))
"Elapsed time: 4386.256124 msecs"
50000000

repl> (time (reduce #(when (= %2 50000000) (reduced %2)) nil (range)))
"Elapsed time: 993.267553 msecs"
50000000

The results show that reduce way may be the most efficient solution as in clojure 1.7.

象嘉道
  • 3,657
  • 5
  • 33
  • 49
  • 2
    Very interesting. Thanks for testing these, xando. If would be better to have numbers from [Criterium](https://github.com/hugoduncan/criterium), but my guess is that by searching for an item that's so far along in the sequence, the JVM has had a chance to optimize the code. – Mars Dec 31 '16 at 21:01
15

In 2016 there was a patch submitted to clojure core that added an efficient shortcut for (first (filter pred coll)) idiom, it was called seek.

The implementation avoided problems in herent with both the (first (filter)) and (some #(when (pred))) alternatives. That is, it works efficiently with chunked sequences and plays nice with nil? and false? predicates.

Patch:

(defn seek
  "Returns first item from coll for which (pred item) returns true.
   Returns nil if no such item is present, or the not-found value if supplied."
  {:added  "1.9" ; note, this was never accepted into clojure core
   :static true}
  ([pred coll] (seek pred coll nil))
  ([pred coll not-found]
   (reduce (fn [_ x]
             (if (pred x)
               (reduced x)
               not-found))
           not-found coll)))

Examples:

(seek odd? (range)) => 1
(seek pos? [-1 1]) => 1
(seek pos? [-1 -2] ::not-found) => ::not-found
(seek nil? [1 2 nil 3] ::not-found) => nil

Eventually the patch was rejected:

Upon review, we've decided that we do not wish to include this. Use of linear search (and in particular nested linear search) leads to poor performance - often it's better to use other kinds of data structures and that's why this functionality has not been included in the past. ~Alex Miller 12/May/17 3:34 PM

Casey
  • 6,166
  • 3
  • 35
  • 42
  • 1
    According to the Jira ticket, the patch was declined. They do not want encourage people to use inefficient linear search. Makes sense to me :) https://clojure.atlassian.net/browse/CLJ-2056 – Andrei Galatyn Feb 09 '21 at 13:58
12

I think some is the best tool for the job:

(some #(if (= % 1) %) '(3 4 1))
deprecated
  • 5,142
  • 3
  • 41
  • 62
  • 5
    This does not suffer from the chunked sequences effect. But doesn't work in cases where `1` is `nil` or `false`. YMMV. – kotarak Apr 17 '12 at 14:35
  • Wait... how can 1 equal nil? :) anyway, Matthew specified that elements should 'evaluate to true' so I think the behavior is the desired. – deprecated Apr 17 '12 at 15:07
  • 3
    Consider `f` to be `#(contains? #{0 "false" false} %)`. And he said "evaluates a fn to true". – kotarak Apr 17 '12 at 15:11
7

Using drop-while instead of filter should address "over-application" of f for chunked sequences:

(defn find-first [f coll]
  (first (drop-while (complement f) coll)))
;;=> #'user/find-first

(find-first #(= % 1) [3 4 1])
;;=> 1
Dimagog
  • 1,785
  • 19
  • 14
  • Cool solution :). I tested to ensure laziness. ```(find-first #(do (println %) (even? %)) [1 3 2 5 7 9])```, returns 2 and doesn't print 5, 7, or 9. – Joel M May 21 '20 at 13:28
0

The way I do this in clojure is sort like you might do it in Scheme.

(defn call-with-found
  "Call the given predicate, pred, on successive elements of the collection
  until the first time pred returns a truthy value, at which time if-found
  is called with that element of the collection, and call-with-found returns
  the return value of if-found.   If no such element of collection is found
  (including if collection is empty) then the value if-not-found (defaulting
  to false) is returned."
  ([pred coll & {:keys [if-found if-not-found]
                 :or {if-found (constantly true)
                      if-not-found false}}]
   (reduce (fn [_ item]
             (if (pred item)
               (reduced (if-found item))
               if-not-found)) if-not-found coll)))

The function call-with-found is called with a predicate and a collection. We search the collection until we find an element which satisfies the predicate, at which point we call the if-found continuation with that value, else we return the if-not-found value.

Jim Newton
  • 594
  • 3
  • 16