Scala - access collection members within map or flatMap

Question

Suppose that I use a sequence of various maps and/or flatMaps to generate a sequence of collections. Is it possible to access information about the "current" collection from within any of those methods? For example, without knowing anything specific about the functions used in the previous maps or flatMaps, and without using any intermediate declarations, how can I get the maximum value (or length, or first element, etc.) of the collection upon which the last map acts?

List(1, 2, 3)
  .flatMap(x => f(x) /* some unknown function */)
  .map(x => x + ??? /* what is the max element of the collection? */)

Edit for clarification:

In the example, I'm not looking for the max (or whatever) of the initial List. I'm looking for the max of the collection after the flatMap has been applied.
By "without using any intermediate declarations" I mean that I do not want to use any temporary collections en route to the final result. So, the example by Steve Waldman below, while giving the desired result, is not what I am seeking. (I include this condition mostly for aesthetic reasons.)

Edit for clarification, part 2:

The ideal solution would be some magic keyword or syntactic sugar that lets me reference the current collection:

List(1, 2, 3)
  .flatMap(x => f(x))
  .map(x => x + theCurrentList.max)

I'm prepared to accept the fact, however, that this simply is not possible.

"max element" of which `List`? The original `List(1,2,3)`, or the output from the `flatMap()`, or the output from the `map()`? — jwvh, Jun 13 '17 at 22:50
Simply not possible (or at least not possible simply). Think about it. The argument to a `map()` (or `flatMap()`) is a function that takes a parameter (an element from the current collection) and produces an output. The function is invoked once for each element of the collection. There is no way for the function to "know" how many times it will be invoked or what the next argument will be when it is invoked. — jwvh, Jun 14 '17 at 16:50
I don't think what you are trying to do, as you worded it, is possible. That said, if you could let me know what it is you are trying to accomplish exactly, I may be able to come up with something. As I showed, it's possible to do it all as a single expression, but it's inelegant. But if you are simply trying to limit exposed `val`s, there are ways to accomplish that. — Jack Leow, Jun 14 '17 at 19:18

Steve Waldman · Answer 1 · 2017-06-13T23:07:56.027

2

Maybe just define the list as a val, so you can name it? I don't know of any facility built into map(...) or flatMap(...) that would help.

val myList = List(1, 2, 3)
myList
  .flatMap(x => f(x) /* some unknown function */)
  .map(x => x + myList.max /* note: this is the max of the original List, not of the flatMap output */)

Update: By this approach at least, if you have multiple transformations and want to see the transformed version, you'd have to name that. You could get away with

val myList = List(1, 2, 3).flatMap(x => f(x) /* some unknown function */)

myList.map(x => x + myList.max /* what is the max element of the List? */)

Or, if there will be multiple transformations, get in the habit of naming the stages.

val rawList    = List(1, 2, 3)
val smordified = rawList.flatMap(x => f(x) /* some unknown function */)
val maxified   = smordified.map(x => x + smordified.max /* what is the max element of the List? */)
maxified
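If the concern is mainly that intermediate names leak into the surrounding scope, one option is to keep the named stages inside a block expression, so only the final result is visible. The following is a minimal sketch, not a built-in facility: the object name `ScopedStagesSketch` and the deterministic `f` are hypothetical stand-ins for the unknown function.

```scala
object ScopedStagesSketch {
  // Hypothetical stand-in for the unknown function; deterministic so the result is predictable.
  def f(x: Int): Vector[Double] = Vector(x * 0.5, x * 2.0)

  // Only `maxified` is visible outside the block; the intermediate names are local to it.
  val maxified: List[Double] = {
    val smordified = List(1, 2, 3).flatMap(f) // the "current" collection after flatMap
    val currentMax = smordified.max           // computed once, not once per element
    smordified.map(_ + currentMax)
  }
}
```

The intermediate collection still exists, but its name does not escape the block, and the max is computed a single time.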

Update 2: Watch it work in the REPL even with heterogeneous types:

scala> def f( x : Int ) : Vector[Double] = Vector(x * math.random, x * math.random )
f: (x: Int)Vector[Double]

scala> val rawList    = List(1, 2, 3)
rawList: List[Int] = List(1, 2, 3)

scala> val smordified = rawList.flatMap(x => f(x) /* some unknown function */)
smordified: List[Double] = List(0.40730853571901315, 0.15151641399798665, 1.5305929709857609, 0.35211231420067435, 0.644241939254793, 0.15530230501048903)

scala> val maxified   = smordified.map(x => x + smordified.max /* what is the max element of the List? */)
maxified: List[Double] = List(1.937901506704774, 1.6821093849837476, 3.0611859419715217, 1.8827052851864352, 2.1748349102405538, 1.6858952759962498)

scala> maxified
res3: List[Double] = List(1.937901506704774, 1.6821093849837476, 3.0611859419715217, 1.8827052851864352, 2.1748349102405538, 1.6858952759962498)

edited Jun 13 '17 at 23:07

answered Jun 13 '17 at 22:45


That wouldn't be the max of the current List. Is it? – juanpavergara Jun 13 '17 at 22:49
Good point. No, it would be the original list. If you want post-transformations, you'd want to name the transformed version. I'll add an update like that. – Steve Waldman Jun 13 '17 at 22:51
The evaluation of `f(x)` would have to be of type `GenTraversableOnce`, so the solution would get complicated because `max` would have to exist for that type :/ I think the question makes sense only with a series of transformations with `map` and no `flatMap` – juanpavergara Jun 13 '17 at 22:56
The Scala compiler is really smart! It'll work, as long as there's a plus operator and an ordering (for the `max` operation) defined for the type parameter of the collection generated by `f(x)`. – Steve Waldman Jun 13 '17 at 23:03
I've updated again with an example, transforming in `flatMap(...)` from `List[Int]` to `Vector[Double]`. – Steve Waldman Jun 13 '17 at 23:08
Steve, this technique does give the desired result, but unfortunately not in the style that I am seeking; see my edit. – mike w Jun 14 '17 at 04:49
Fair enough! If there's a way to do it in the style you prefer, I'd like to know it too. – Steve Waldman Jun 14 '17 at 04:51

Jack Leow · Answer 2 · 2017-06-14T06:33:20.903

It is possible, but not pretty, and not likely something you want if you are doing it for "aesthetic reasons."

import scala.math.max

def f(x: Int): Seq[Int] = ???

List(1, 2, 3).
  flatMap(x => f(x) /* some unknown function */).
  foldRight((List[Int](),List[Int]())) {
    case (x, (xs, Nil)) => ((x :: xs), List.fill(xs.size + 1)(x))
    case (x, (xs, xMax :: _)) => ((x :: xs), List.fill(xs.size + 1)(max(x, xMax)))
  }.
  zipped.
  map {
    case (x, xMax) => x + xMax
  }

// Alternatively, a slightly more efficient version using Streams
// (written for Scala 2.12; in 2.13, Stream is deprecated in favor of LazyList).
List(1, 2, 3).
  flatMap(x => f(x) /* some unknown function */).
  foldRight((List[Int](),Stream[Int]())) {
    case (x, (xs, Stream())) =>
      ((x :: xs), Stream.continually(x))
    case (x, (xs, curXMax #:: _)) =>
      val newXMax = max(x, curXMax)
      ((x :: xs), Stream.continually(newXMax))
  }.
  zipped.
  map {
    case (x, xMax) => x + xMax
  }

Seriously though, I just took this on to see if I could do it. While the code didn't turn out as bad as I expected, I still don't think it's particularly readable. I'd discourage using this over something similar to Steve Waldman's answer. Sometimes, it's simply better to just introduce a val, rather than being dogmatic about it.
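For comparison, the same single-expression idea can be sketched without `zipped` or `Stream` by carrying the running max through the fold as an `Option` and mapping once at the end. This is a variation on the code above rather than a transcription of it, and the object name `FoldMaxSketch` and the concrete `f` are hypothetical stand-ins for the unknown function.

```scala
object FoldMaxSketch {
  // Hypothetical stand-in for the unknown function in the question.
  def f(x: Int): Seq[Int] = Seq(x, x * 10)

  val result: List[Int] =
    List(1, 2, 3)
      .flatMap(f)
      // Rebuild the list while tracking the max seen so far (None for an empty input).
      .foldRight((List.empty[Int], Option.empty[Int])) {
        case (x, (xs, None))    => (x :: xs, Some(x))
        case (x, (xs, Some(m))) => (x :: xs, Some(math.max(x, m)))
      } match {
        case (xs, Some(m)) => xs.map(_ + m) // add the overall max to every element
        case (_, None)     => Nil           // empty input: nothing to map
      }
}
```

Swapping in a different summary (a sum of squares, say) would mean changing the accumulator and its update step, so the inflexibility noted in the comment below still applies.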

This is a very interesting solution, but as you say, it does not seem to have any advantages over the other solution. In particular, it does not seem very flexible: e.g., if I had wanted the sum of the squares of all elements of the list instead of the max element, it seems like it would require significant code changes. — mike w, Jun 14 '17 at 16:35

score 0 · Answer 3 · edited Apr 08 '22 at 08:07

One somewhat-simple way of referencing prior output within the current map/collect operation is to use a named reference outside the map, then reference it from within the map block:

var prevOutput = ...  // starting value of whatever is referenced within the map
myValues.map { x =>   // `map` needs a function of the current element
  prevOutput = ...    // expression that references `x` and the prior `prevOutput`
  prevOutput          // return the computed value for the map to collect
}

This draws attention to the fact that we're referencing prior elements while building the new sequence.

This would be messier, though, if you wanted to reference arbitrary earlier values, not just the previous one.
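As a concrete illustration of the pattern, here is a minimal sketch that computes a running maximum. The names `RunningMaxSketch`, `runningMax` and the sample `myValues` are hypothetical. Note that a `var` updated this way only ever sees elements already visited, so it yields the max so far, not the max of the whole collection that the question asks about.

```scala
object RunningMaxSketch {
  val myValues: List[Int] = List(3, 1, 4, 1, 5) // hypothetical sample data

  // The `var` approach: relies on List.map being strict and visiting elements in order.
  def runningMax(values: List[Int]): List[Int] = {
    var prevOutput = Int.MinValue
    values.map { x =>
      prevOutput = math.max(prevOutput, x) // references the prior `prevOutput`
      prevOutput
    }
  }

  // A side-effect-free alternative using scanLeft (a different technique, same result).
  def runningMaxScan(values: List[Int]): List[Int] =
    values.scanLeft(Int.MinValue)(_ max _).tail
}
```

The `scanLeft` form avoids mutable state, which matters if the collection is lazy or the traversal order is not guaranteed.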

score 0 · Accepted Answer · answered Mar 06 '22 at 21:32

You could define a mapWithSelf (resp. flatMapWithSelf) operation along these lines and add it as an implicit enrichment to the collection. For List it might look like:

// Scala 2.13 APIs
object Enrichments {
  implicit class WithSelfOps[A](val lst: List[A]) extends AnyVal {
    def mapWithSelf[B](f: (A, List[A]) => B): List[B] =
      lst.map(f(_, lst))

    def flatMapWithSelf[B](f: (A, List[A]) => IterableOnce[B]): List[B] =
      lst.flatMap(f(_, lst))
  }
}

The enrichment basically fixes the value of the collection before the operation and threads it through. It should be possible to generify this (at least for the strict collections), though it would look a little different in 2.12 vs. 2.13+.

Usage would look like

import Enrichments._

val someF: Int => IterableOnce[Int] = ???

List(1, 2, 3)
  .flatMap(someF)
  .mapWithSelf { (x, lst) =>
    x + lst.max
  }

So at the usage site, it's aesthetically pleasant. Note that if you're computing something which traverses the list, you'll be traversing the list every time (leading to a quadratic runtime). You can get around that with some mutability or by just saving the intermediate list after the flatMap.
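One way to sketch the "save the intermediate list" option while keeping the chained style is `pipe` from `scala.util.chaining` (available since Scala 2.13), which passes the current value to a function. Inside that function the collection has a name, so the max can be computed once. The object name `PipeSketch` and the concrete `someF` below are hypothetical stand-ins.

```scala
import scala.util.chaining._

object PipeSketch {
  // Hypothetical stand-in for the unknown function.
  def someF(x: Int): List[Int] = List(x, x * 10)

  val result: List[Int] =
    List(1, 2, 3)
      .flatMap(someF)
      .pipe { xs =>       // `xs` is the collection produced by flatMap
        val m = xs.max    // one traversal, instead of one per element
        xs.map(_ + m)
      }
}
```

This trades the two-argument lambda of `mapWithSelf` for a local name inside the `pipe` block, and the overall cost stays linear in the length of the list.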
