Why does lazy.compactMap.first map the first element twice?

Question

I'm testing compactMap on a lazy collection to find the first matching element and map it in a few lines of code.

"abc5def".lazy
  .compactMap {
    print($0)
    return Int(String($0))
}.first as Int?

Prints

a
b
c
5
5

Why is the last element mapped twice? How can I avoid this behaviour?

Related: https://stackoverflow.com/questions/56782134/compactmap-on-sequence-not-lazy — Martin R, Jan 17 '20 at 11:46
@MartinR Is this because `first` is a property while `first(where:)` is a method? — Joakim Danielson, Jan 17 '20 at 12:03
Also related (if not a duplicate): https://stackoverflow.com/questions/41940243/why-does-filter-s-predicate-get-called-so-many-times-when-evaluating-it-lazi. — Martin R, Jan 17 '20 at 12:13
Thanks! `.first(where: { _ in true })` works better. I think I need to check Swift's source code to find out why this happened. @JoakimDanielson `.first` has to be a getter — Dmitry Kozlov, Jan 17 '20 at 12:18
Does this answer your question? [compactMap on sequence() not lazy?](https://stackoverflow.com/questions/56782134/compactmap-on-sequence-not-lazy) — Dmitry Kozlov, Jan 17 '20 at 12:22
Interestingly, if you put the chain `lazy.compactMap(...).first(where: { _ in true })` in a separate method in an extension of Collection, it stops working. — Vladlex, Apr 14 '21 at 16:34

Cristik · Answer 1 · 2022-05-05T18:10:22.400

TL;DR: The compactMap call returns a chain of lazy sequences, LazyMapSequence<LazyFilterSequence<LazyMapSequence<... Combined with the fact that first needs to compute both the start index and the element at that index, this results in the transform closure being called twice:

when startIndex is computed
when retrieving the element at the start index

This is the current implementation of compactMap over LazySequenceProtocol (a protocol that all lazy sequences conform to):

public func compactMap<ElementOfResult>(
    _ transform: @escaping (Elements.Element) -> ElementOfResult?
  ) -> LazyMapSequence<
    LazyFilterSequence<
      LazyMapSequence<Elements, ElementOfResult?>>,
    ElementOfResult
  > {
    return self.map(transform).filter { $0 != nil }.map { $0! }
}

This gives your "abc5def".lazy.compactMap { ... } expression the type LazyMapSequence<LazyFilterSequence<LazyMapSequence<String, Optional<Int>>>, Int>.

Secondly, you're asking for the first element of the lazy sequence. This resolves to the default implementation of first on the Collection protocol (a lazy sequence automatically conforms to Collection when its base sequence is also a collection):

public var first: Element? {
    let start = startIndex
    if start != endIndex { return self[start] }
    else { return nil }
}

This means that first has to retrieve two pieces of information:

the start index
the value at the start index (the subscript part)

Now, it's the startIndex computation that causes the duplicate evaluation, because of this implementation on LazyFilterSequence:

public var startIndex: Index {
    var index = _base.startIndex
    while index != _base.endIndex && !_predicate(_base[index]) {
      _base.formIndex(after: &index)
    }
    return index
}
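As a minimal sketch of that effect (the names seenByTransform and lazyDigits are made up for this illustration), you can record which characters reach the transform and then ask only for startIndex, without reading any element:

```swift
// Records every character the transform receives, in call order.
// (Hypothetical helper state, used only for this illustration.)
var seenByTransform = ""

let lazyDigits = "abc5def".lazy.compactMap { (ch: Character) -> Int? in
    seenByTransform.append(ch)
    return Int(String(ch))
}

// Building the lazy chain alone should not call the transform.
let seenBeforeAccess = seenByTransform

// Asking for startIndex makes the filter layer scan for the first
// element that passes `$0 != nil`, which runs the transform on the way.
_ = lazyDigits.startIndex
let seenAfterStartIndex = seenByTransform

print("before: \(seenBeforeAccess), after startIndex: \(seenAfterStartIndex)")
```

With the implementation quoted above, the scan should stop at the first character that converts to an Int, so the log after startIndex holds everything up to and including "5".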

The subscript implementation over LazyMapSequence is a standard one:

public subscript(position: Base.Index) -> Element {
    return _transform(_base[position])
}

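However, as you can see, it calls the transform again. The startIndex scan has already evaluated _base[index] on the inner LazyMapSequence for every element up to the first match, so the transform has run once on that element. Reading self[start] in first then goes back down the subscript chain to the same inner LazyMapSequence and runs the transform a second time, which produces the second print you see.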
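To avoid the second call, one option is the `first(where: { _ in true })` workaround mentioned in the comments. It is a Sequence method, so it walks the chain through an iterator instead of going through startIndex and the subscript. The sketch below compares the two; log, digits and the other names are made up for the illustration, and the chain is bound to a constant first so that both calls run on the same lazy value:

```swift
// Characters passed to the transform, in call order (illustration only).
var log = ""

let digits = "abc5def".lazy.compactMap { (ch: Character) -> Int? in
    log.append(ch)
    return Int(String(ch))
}

// Collection.first: startIndex scan, then a subscript read of the same element.
let viaFirst = digits.first
let logForFirst = log

log = ""

// Sequence.first(where:): a single iterator pass that stops at the first match.
let viaFirstWhere = digits.first(where: { _ in true })
let logForFirstWhere = log

print(viaFirst as Any, logForFirst)
print(viaFirstWhere as Any, logForFirstWhere)
```

Both calls return the same value; only the number of transform invocations differs. If the transform is expensive or has side effects, the iterator-based form is the safer choice.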
