15

I have a list l: List[T1], and currently I'm doing the following:

myfun: T1 => Option[T2]
val x: Option[T2] = l.map(myfun).flatten.find(_ => true)

The myfun function returns None or Some, flatten throws away all the None's and find returns the first element of the list if any.

This seems a bit hacky to me. I suspect there is a for-comprehension or something similar that does this less wastefully or more cleverly. For example, I don't need any subsequent results once myfun returns a Some during the map over the list l.

giampaolo
svrist

5 Answers

13

How about:

l.toStream flatMap (myfun andThen (_.toList)) headOption

Stream is lazy, so it won't map everything in advance, but it won't remap things either. Instead of flattening things, convert Option to List so that flatMap can be used.
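To make the laziness visible, here is a minimal sketch. It assumes a hypothetical myfun that counts its own calls and matches only the value 3; the object name and the sample list are illustrative and not part of the question. On the Scala versions I am aware of, only the elements up to the first Some should be evaluated.

```scala
object StreamFirstSome {
  // Hypothetical stand-in for myfun: counts its calls so laziness is observable.
  var calls = 0
  def myfun(i: Int): Option[String] = {
    calls += 1
    if (i == 3) Some("three") else None
  }

  val l = List(1, 2, 3, 4, 5, 6)

  // Stream evaluates elements on demand. flatMap skips the empty lists
  // produced by None, and headOption stops at the first non-empty one.
  // Note: Stream is deprecated since Scala 2.13 in favor of LazyList.
  val x: Option[String] = l.toStream.flatMap(i => myfun(i).toList).headOption
}
```

With this setup, StreamFirstSome.x is Some("three") and myfun runs for the elements 1, 2 and 3 only.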

Daniel C. Sobral
  • 1
    If I'm not confused by myself, "flatMap" can be used on Option as well, so I think "andThen (_.toList)" is superfluous – Jens Schauder Apr 24 '11 at 13:26
  • 1
    val l = List(1, 2, 3, 4, 5, 6) def fun(i : Int) = { if (i == 3) Some(3) else None } println(l.flatMap(fun(_)).head) println(l.flatMap(fun(_)).headOption) – Jens Schauder Apr 25 '11 at 12:56
  • results in 3 and Some(3) so it does work, or am I messing something up? – Jens Schauder Apr 25 '11 at 12:57
  • @Jens That `fun` is a `def`, not a function `T => Option[T]`. – Daniel C. Sobral Apr 25 '11 at 13:08
  • Starting to feel kinda dumb here... what do you mean by "it's a def"? It is a function from Int to Option[Int]. I can write it as val fun : (Int => Option[Int]) = (i : Int) => { if (i == 3) Some(3) else None } – Jens Schauder Apr 25 '11 at 14:12
  • @Jens In your example, `fun` was a method that took an `Int` and returned an `Option[Int]`. That is different from a method (or a value) that returns a function of type `Int => Option[Int]`. For a method, `l.flatMap(fun).headOption` will work. If `fun` is a function value, however, it won't. I suspect you tested it with `fun(_)` instead of just `fun`, which resulted in Scala doing a partial function application, effectively creating a new function. In that context, type inference results in implicit conversions being applied. – Daniel C. Sobral Apr 25 '11 at 15:20
  • The more I use Scala, the more I'm in love with it. Thanks :) – Lev Sivashov Jan 06 '16 at 11:09
4

In addition to using toStream to make the search lazy, we can use Stream::collectFirst:

List(1, 2, 3, 4, 5, 6, 7, 8).toStream.map(function).collectFirst { case Some(d) => d }
// Option[String] = Some(hello)
// given def function(i: Int): Option[String] = if (i == 5) Some("hello") else None

This:

  • Transforms the List into a Stream in order to stop the search early.

  • Maps each element to an Option using function.

  • Collects the first mapped element that is not None and extracts its value.

Starting with Scala 2.13, where Stream is deprecated in favor of LazyList, this becomes:

List(1, 2, 3, 4, 5, 6, 7, 8).to(LazyList).map(function).collectFirst { case Some(d) => d }
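As a rough check that the search really stops early, the sketch below wraps the LazyList version in an object and adds a call counter. The counter and the object name are assumptions added for illustration. It needs Scala 2.13 or later.

```scala
object LazyListFirstSome {
  // Same function as in the snippet above, plus a counter (added for illustration).
  var calls = 0
  def function(i: Int): Option[String] = {
    calls += 1
    if (i == 5) Some("hello") else None
  }

  // map on a LazyList is lazy, and collectFirst stops at the first
  // element for which the partial function is defined.
  val result: Option[String] =
    List(1, 2, 3, 4, 5, 6, 7, 8)
      .to(LazyList)
      .map(function)
      .collectFirst { case Some(d) => d }
}
```

Here result is Some("hello"), and function is applied to the elements 1 through 5 but not to 6, 7 or 8.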
Xavier Guihot
  • 2
    IMHO this solution is the only one that makes it clear to a casual reader that we are searching for the first Some. In the other solutions that use flatMap with Option.toList, the Option -> List conversion and flatMap obfuscate the original intent: "find the first Some" – tksfz Feb 19 '19 at 05:24
  • This is how it is solved on this [issue](https://stackoverflow.com/a/29234550/5826349) as well – Valy Dia Mar 01 '19 at 16:57
  • You don't need to convert to a `Stream` for this - List also has a `collectFirst` method and you can convert the function itself into a `PartialFunction` with `Function.unlift`. – Robin Green Sep 12 '19 at 10:42
  • @RobinGreen If the `collectFirst` matches the first element in the list, then without a lazy sequence (stream/iterator/lazylist), you would wastefully be applying the `function` mapping on all elements in the list even though you only really need to apply it for that first element. Also, by applying the `function` operation within the lifting part of the `collectFirst` you'd also have to apply it a second time within the mapping part of the `collectFirst` which might be wasteful if `function` is expensive. – Xavier Guihot Sep 12 '19 at 12:45
3

Well, this is almost right, but not quite:

val x = (l flatMap myfun).headOption

But you are returning an Option rather than a List from myfun, so this may not work. If it doesn't (I have no REPL to hand), try this instead:

val x = (l flatMap(myfun(_).toList)).headOption
oxbow_lakes
2

Well, the for-comprehension equivalent is pretty easy

(for (x <- l; y <- myfun(x)) yield y).headOption

which, if you actually do the translation, works out the same as what oxbow_lakes gave. Assuming reasonable laziness of List.flatMap, this is both a clean and efficient solution.
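The translation can be sketched as follows, using a hypothetical myfun with a call counter (the object name, the counter and the sample list are assumptions for illustration). It also shows what happens on a plain, strict List: the whole list is mapped before headOption is applied.

```scala
object ForDesugar {
  // Hypothetical myfun that counts how often it is called.
  var calls = 0
  def myfun(i: Int): Option[Int] = {
    calls += 1
    if (i == 3) Some(i) else None
  }

  val l = List(1, 2, 3, 4, 5, 6)

  // The for-comprehension over the list and the Option...
  val viaFor: Option[Int] = (for (x <- l; y <- myfun(x)) yield y).headOption
  // ...has evaluated myfun for every element at this point, since List is strict.
  val callsAfterFor: Int = calls

  // Roughly what the compiler generates: flatMap over the list, map over the Option.
  val viaFlatMap: Option[Int] = l.flatMap(x => myfun(x).map(y => y)).headOption
}
```

Both expressions yield Some(3), and callsAfterFor is 6, one call per list element.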

Dave Griffith
  • 2
    Unfortunately, List (i.e. collection.immutable.List) does not have lazy operations. For reasons I don't understand, replacing `l` with `l.view` results in myfun being evaluated multiple times with the same arguments. – Aaron Novstrup Sep 04 '10 at 22:28
  • 8
    view is call-by-name, not lazy. If you want at-most-once evaluation, use toStream: (l.toStream flatMap myfun).headOption – Martin Odersky Sep 05 '10 at 09:03
0

As of 2017, the previous answers seem to be outdated. I ran some benchmarks (list of 10 million Ints, first match roughly in the middle, Scala 2.12.3, Java 1.8.0, 1.8 GHz Intel Core i5). Unless otherwise noted, list and map have the following types:

list: scala.collection.immutable.List
map: A => Option[B]

Simply call map on the list: ~1000 ms

list.map(map).find(_.isDefined).flatten

First call toStream on the list: ~1200 ms

list.toStream.map(map).find(_.isDefined).flatten

Call toStream.flatMap on the list: ~450 ms

list.toStream.flatMap(map(_).toList).headOption

Call flatMap on the list: ~100 ms

list.flatMap(map(_).toList).headOption

First call iterator on the list: ~35 ms

list.iterator.map(map).find(_.isDefined).flatten

Recursive function find(): ~25 ms

def find[A,B](list: scala.collection.immutable.List[A], map: A => Option[B]) : Option[B] = {
  list match {
    case Nil => None
    case head::tail => map(head) match {
      case None => find(tail, map)
      case result @ Some(_) => result
    }
  }
}

Iterative function find(): ~25 ms

def find[A,B](list: scala.collection.immutable.List[A], map: A => Option[B]) : Option[B] = {
  for (elem <- list) {
    val result = map(elem)
    if (result.isDefined) return result
  }
  return None
}

You can speed things up further by using Java collections instead of Scala collections and a less functional style.

Loop over indices in java.util.ArrayList: ~15 ms

def find[A,B](list: java.util.ArrayList[A], map: A => Option[B]) : Option[B] = {
  var i = 0
  while (i < list.size()) {
    val result = map(list.get(i))
    if (result.isDefined) return result
    i += 1
  }
  return None
}

Loop over indices in java.util.ArrayList with function returning null instead of None: ~10 ms

def find[A,B](list: java.util.ArrayList[A], map: A => B) : Option[B] = {
  var i = 0
  while (i < list.size()) {
    val result = map(list.get(i))
    if (result != null) return Some(result)
    i += 1
  }
  return None
}

(Of course, one would usually declare the parameter type as java.util.List, not java.util.ArrayList. I chose the latter here because it's the class I used for the benchmarks. Other implementations of java.util.List will show different performance - most will be worse.)
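For readers who want to stay with Scala collections, the iterator variant from the list above can be wrapped in a small helper. This is only a sketch: the name firstSome, the sample data and the call counter are illustrative assumptions and were not part of the benchmark.

```scala
object IteratorFirstSome {
  // Hypothetical helper wrapping the iterator-based expression shown earlier.
  // Iterator.map is lazy and find stops at the first defined Option.
  def firstSome[A, B](list: List[A])(f: A => Option[B]): Option[B] =
    list.iterator.map(f).find(_.isDefined).flatten

  // Count how many elements are inspected before the first match.
  var calls = 0
  val data: List[Int] = (1 to 1000).toList
  val hit: Option[Int] = firstSome(data) { i =>
    calls += 1
    if (i == 500) Some(i * 2) else None
  }
}
```

In this example hit is Some(1000) and the mapping function is called 500 times, so the second half of the list is never touched.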

  • Profiling the JVM is [notoriously problematic](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java). What benchmark tools did you use? [Thyme](https://github.com/Ichoran/thyme)? JMH? Other? – jwvh Dec 23 '17 at 22:16
  • I know about the problems with Java benchmarks. I didn't use any tools, just System.nanoTime(). Each of these numbers is the average of 100 runs: start JVM, fill list with 10 million random ints, start the clock, run find() 100 times, stop the clock. Not very precise, but since there are differences of several orders of magnitude, I'd say this simple benchmark gives at least a useful rough overview of the relative performance of these approaches. – jcsahnwaldt Reinstate Monica Dec 24 '17 at 00:50
  • I'm travelling right now, but I hope I'll have time to upload the (very simple) code I used in the next few days. I'd love to see the same benchmarks measured by a proper tool! – jcsahnwaldt Reinstate Monica Dec 24 '17 at 00:52
  • By the way, some had large speed variations due to GC (e.g. toStream), others had almost no variation (e.g. iterator.map). I'd bet a more fine-grained benchmark would reproduce these numbers to within ±20% (for the ones with low variation) to ±50% (for the ones with high GC overhead). – jcsahnwaldt Reinstate Monica Dec 24 '17 at 00:54
  • P.S.: I used the same random seed (and thus the same 10 million random numbers) for all benchmarks. – jcsahnwaldt Reinstate Monica Dec 28 '17 at 21:41