3

Suppose I have a custom subscriber that requests one value on subscription and then an additional value three seconds after it receives the previous value:

class MySubscriber: Subscriber {
    typealias Input = Int
    typealias Failure = Never

    private var subscription: Subscription?

    func receive(subscription: Subscription) {
        print("Subscribed")

        self.subscription = subscription
        subscription.request(.max(1))
    }

    func receive(_ input: Int) -> Subscribers.Demand {
        print("Value: \(input)")

        DispatchQueue.main.asyncAfter(deadline: .now() + .seconds(3)) {
            self.subscription?.request(.max(1))
        }

        return .none
    }

    func receive(completion: Subscribers.Completion<Never>) {
        print("Complete")
        subscription = nil
    }
}

If I use this to subscribe to an infinite range publisher, back pressure is handled gracefully, with the publisher waiting 3 seconds each time until it receives the next demand to send a value:

(1...).publisher.subscribe(MySubscriber())

// Prints values infinitely with ~3 seconds between each:
//
//     Subscribed
//     Value: 1
//     Value: 2
//     Value: 3
//     ...

But if I add a map operator then MySubscriber never even receives a subscription; map appears to have synchronously requested Demand.Unlimited upon receiving its subscription and the app infinitely spins as map tries to exhaust the infinite range:

(1...).publisher
    .map { value in
        print("Map: \(value)")
        return value * 2
    }
    .subscribe(MySubscriber())

// The `map` transform is executed infinitely with no delay:
//
//     Map: 1
//     Map: 2
//     Map: 3
//     ...

My question is, why does map behave this way? I would have expected map to just pass its downstream demand to the upstream. Since map is supposed to be for transformation rather than side effects, I don't understand what the use case is for its current behavior.

EDIT

I implemented a version of map to show how I think it ought to work:

extension Publishers {
    struct MapLazily<Upstream: Publisher, Output>: Publisher {
        typealias Failure = Upstream.Failure

        let upstream: Upstream
        let transform: (Upstream.Output) -> Output

        init(upstream: Upstream, transform: @escaping (Upstream.Output) -> Output) {
            self.upstream = upstream
            self.transform = transform
        }

        public func receive<S: Subscriber>(subscriber: S) where S.Input == Output, S.Failure == Upstream.Failure {
            let mapSubscriber = Subscribers.LazyMapSubscriber(downstream: subscriber, transform: transform)
            upstream.receive(subscriber: mapSubscriber)
        }
    }
}

extension Subscribers {
    class LazyMapSubscriber<Input, DownstreamSubscriber: Subscriber>: Subscriber {
        let downstream: DownstreamSubscriber
        let transform: (Input) -> DownstreamSubscriber.Input

        init(downstream: DownstreamSubscriber, transform: @escaping (Input) -> DownstreamSubscriber.Input) {
            self.downstream = downstream
            self.transform = transform
        }

        func receive(subscription: Subscription) {
            downstream.receive(subscription: subscription)
        }

        func receive(_ input: Input) -> Subscribers.Demand {
            downstream.receive(transform(input))
        }

        func receive(completion: Subscribers.Completion<DownstreamSubscriber.Failure>) {
            downstream.receive(completion: completion)
        }
    }
}

extension Publisher {
    func mapLazily<Transformed>(transform: @escaping (Output) -> Transformed) -> AnyPublisher<Transformed, Failure> {
        Publishers.MapLazily(upstream: self, transform: transform).eraseToAnyPublisher()
    }
}

Using this operator, MySubscriber receives the subscription immediately and the mapLazily transform is only executed when there is demand:

(1...).publisher
    .mapLazily { value in
        print("Map: \(value)")
        return value * 2
    }
    .subscribe(MySubscriber())

// Only transforms the values when they are demanded by the downstream subscriber every 3 seconds:
//
//     Subscribed
//     Map: 1
//     Value: 2
//     Map: 2
//     Value: 4
//     Map: 3
//     Value: 6
//     Map: 4
//     Value: 8

My guess is that the particular overload of map defined for Publishers.Sequence is using some kind of shortcut to enhance performance. This breaks for infinite sequences, but even for finite sequences eagerly exhausting the sequence regardless of the downstream demand messes with my intuition. In my view, the following code:

(1...3).publisher
    .map { value in
        print("Map: \(value)")
        return value * 2
    }
    .subscribe(MySubscriber())

ought to print:

Subscribed
Map: 1
Value: 2
Map: 2
Value: 4
Map: 3
Value: 6
Complete

but instead prints:

Map: 1
Map: 2
Map: 3
Subscribed
Value: 2
Value: 4
Value: 6
Complete
jjoelson
  • 5,771
  • 5
  • 31
  • 51
  • 1
    What Xcode and what target platform are you using? Using Xcode 11.4 on macOS 10.15.4, I can't reproduce your problem. With your `MySubscriber`, the following pipeline behaves correctly: `(1...3).publisher .print("Empty/map") .map { dump($0) } .print("map/subscribe") .subscribe(MySubscriber()) ` – rob mayoff Apr 10 '20 at 17:34
  • Oh sorry, I just realized I copied the wrong code. I'm trying to use an infinite range `1...` rather than `1...3` (I made it finite at one point for testing). I just updated the question to fix. To answer your question, I'm still on 10.14.6 so Xcode 11.3.1. – jjoelson Apr 10 '20 at 18:16
  • I just tested on my personal MacBook running macOS 10.15.2 and Xcode 11.4, and I'm seeing the same behavior. – jjoelson Apr 10 '20 at 18:21
  • I set out to write a counterexample by testing in a different way, and ended up confirming your results! – matt Apr 11 '20 at 19:53

1 Answers1

4

Here's a simpler test that doesn't involve any custom subscribers:

(1...).publisher
    //.map { $0 }
    .flatMap(maxPublishers: .max(1)) {
        (i:Int) -> AnyPublisher<Int,Never> in
        Just<Int>(i)
            .delay(for: 3, scheduler: DispatchQueue.main)
            .eraseToAnyPublisher()
}
.sink { print($0) }
.store(in: &storage)

It works as expected, but then if you uncomment the .map you get nothing, because the .map operator is accumulating the infinite upstream values without publishing anything.

On the basis of your hypothesis that map is somehow optimizing for a preceding sequence publisher, I tried this workaround:

(1...).publisher.eraseToAnyPublisher()
    .map { $0 }
    // ...

And sure enough, it fixed the problem! By hiding the sequence publisher from the map operator, we prevent the optimization.

matt
  • 515,959
  • 87
  • 875
  • 1,141
  • I'd say it's a nice clean bug and you should report it. :) – matt Apr 11 '20 at 20:01
  • 1
    Good example! It seems like inserting `map { $0 }` anywhere in a combine pipeline ought to be a no-op. I also did some more testing with finite sequences and `map` is indeed always eagerly exhausting them as far as I can tell, which means if the transformation is expensive and/or the sequence is long then you are doing all of that work up front rather than as needed. I will report a bug! – jjoelson Apr 11 '20 at 20:06
  • 2
    OK, submitted: `FB7660502`. I hope you don't mind I used your simpler example in the report. – jjoelson Apr 11 '20 at 20:45
  • 1
    On the contrary, I'm very happy about that. – matt Apr 11 '20 at 21:03
  • 1
    Hey I thought of a trivial workaround. Added it to the answer. It confirms your hypothesis. – matt Apr 11 '20 at 21:37
  • Good answer. In fact `Publishers.Sequence` has custom implementations of a whole bunch of operators, many of which might demonstrate the same problem. Search [the documentation](https://developer.apple.com/documentation/combine/publishers/sequence) for `-> Publishers.Sequence`. – rob mayoff Apr 11 '20 at 22:21
  • 4
    Also, here's a really short test case that hangs: `(1...).publisher.map { $0 }.prefix(1).sink { _ in }` – rob mayoff Apr 11 '20 at 22:25
  • 1
    @robmayoff I'm thinking that maybe Publishers.Sequence is a toy meant only for testing. – matt Apr 11 '20 at 22:54
  • 1
    @matt I don't imagine they would have bothered creating custom `Publishers.Sequence` implementations of these various operators if it was only intended for testing. – jjoelson Apr 13 '20 at 13:20
  • Hmmm, `Publishers.Zip` also has this issue, not just the `map` operator. Here's a broken test case: https://repl.it/@alexandermomchilov/Combine-Broken-Map-Operator#main.swift (you have to run it locally, repl.it doesn't have access to Combine). Removing the `prefix(10)` makes Combine try to materialize the infinite sequence (`(1...)`) before processing any steps downstream in the chain. – Alexander Dec 23 '20 at 01:08
  • @Alexander Yeah, it's all about Sequence publisher (not any one operator). – matt Dec 23 '20 at 01:31
  • @matt Is there another work around? Erasing the type with `.eraseToAnyPublisher()` didn't seem to help – Alexander Dec 23 '20 at 01:40
  • Well, I don't understand what you are trying to do. The example is not very apt, since it is easy to write a publisher that emits the numbers 1... one per second. – matt Dec 23 '20 at 01:44
  • I know I can use the `TimerPublisher` directly, `map`ping over it with a `(Date) -> Int`, but then that closure would need to have a side effect (to increment the counter with every iteration). Is there some other way I'm missing? – Alexander Dec 23 '20 at 01:54
  • @Alexander I don't know what you mean by a side effect, and I would just use a `scan`. – matt Dec 23 '20 at 02:02
  • @matt Ahhhh yeah, scan fits the bill perfectly. I was thinking something like `timer.map { _ in counter += 1; return counter }`, where counter is a captured local variable which is mutated on each iteration of map (that was the side effect I was referring it, and it's gross) – Alexander Dec 23 '20 at 02:50
  • `scan` can also act like a publisher-variant of `sequence(first:next:)`, right? Then what's the equivalent of `sequence(state:next:)`? https://swiftdoc.org/v5.1/func/sequence/ – Alexander Dec 23 '20 at 02:52
  • @Alexander are we not rather off topic? – matt Dec 23 '20 at 03:20
  • I suppose. I could ask a new question but it doesn’t really hit the bar for novelty – Alexander Dec 23 '20 at 04:27