I want to collect values until I see the last page, but if the last page never arrives I want to send whatever has been collected once a time limit expires.

I have a way of doing this, but it seems rather wasteful. I'm going to use it to build collections that may hold hundreds of thousands of values, so a more space-efficient method would be preferred.

You can copy and paste the following into a playground:

import UIKit
import Foundation
import Combine

var subj = PassthroughSubject<String, Never>()

let queue = DispatchQueue(label: "Test")

let collectForTime = subj
    .collect(.byTime(queue, .seconds(10)))

let collectUntilLast = subj
  .scan([String]()) { $0 + [$1] }
  .first { $0.last == "LastPage" }

// whichever happens first
let cancel = collectForTime.merge(with: collectUntilLast)
    .first()
    .sink {
        print("complete1: \($0)")
    } receiveValue: {
        print("received1: \($0)")
    }

print("start")

let strings = [
    "!@#$",
    "ZXCV",
    "LastPage", // comment this line to test to see what happens if no last page is sent
    "ASDF",
    "JKL:"
]
// if the last page is present then the items 0..<3 will be sent
// if there's no last page then send what we have
// the main thing is that the system is not just sitting there waiting for a last page that never comes.

for i in (0..<strings.count) {
    DispatchQueue.main.asyncAfter(deadline: .now() + .seconds(i)) {
        let s = strings[i]
        print("sending \(s)")
        subj.send(s)
    }
}
Yogurt

2 Answers

UPDATE

After playing a bit more with the playground, I think all you need is:

subj
  .prefix { $0 != "LastPage" }
  .append("LastPage")
  .collect(.byTime(DispatchQueue.main, .seconds(10)))

I wouldn't use collect, because under the hood it is basically doing the same thing that scan is doing. You only need one more condition in the first closure, e.g. .first { $0.last == "LastPage" || timedOut }, to emit the collected items in case of a timeout.

It's unfortunate that collect doesn't offer the API you need, but we can create another version of it. The idea is to combineLatest the scan output with a stream that emits true after a deadline (in practice it also has to emit false initially, so that combineLatest starts producing values) and to || this extra value into the condition passed to first.

Here is the code:

extension Publisher {

  func collect<S: Scheduler>(
    timeoutAfter interval: S.SchedulerTimeType.Stride,
    scheduler: S,
    orWhere predicate: @escaping ([Output]) -> Bool
  ) -> AnyPublisher<[Output], Failure> {
    scan([Output]()) { $0 + [$1] }
      .combineLatest(
        Just(true)
          .delay(for: interval, scheduler: scheduler)
          .prepend(false)
          .setFailureType(to: Failure.self)
      )
      .first { predicate($0) || $1 }
      .map(\.0)
      .eraseToAnyPublisher()
  }

}

let subj = PassthroughSubject<String, Never>()

let cancel = subj
  .collect(
    timeoutAfter: .seconds(10),
    scheduler: DispatchQueue.main,
    orWhere: { $0.last == "LastPage" }
  )
  .print()
  .sink { _ in }
Fabio Felici
  • After further testing, your answer worked flawlessly. But I need some more time to review – Yogurt Oct 18 '22 at 23:52
  • The update with the append last page won't work because the last page in my project is just an id attached to an item with many values. – Yogurt Oct 19 '22 at 00:02

I made a small change to your technique:

import Foundation
import Combine

var subj = PassthroughSubject<String, Never>()

let lastOrTimeout = subj
    .timeout(.seconds(10), scheduler: RunLoop.main )
    .print("watchdog")
    .first { $0 == "LastPage" }
    .append(Just("Done"))

let cancel = subj
    .prefix(untilOutputFrom: lastOrTimeout)
    .print("main_publisher")
    .collect()
    .sink {
        print("complete1: \($0)")
    } receiveValue: {
        print("received1: \($0)")
    }

print("start")

let strings = [
    "!@#$",
    "ZXCV",
    "LastPage", // comment this line to test to see what happens if no last page is sent
    "ASDF",
    "JKL:"
]
// if the last page is present then the items 0..<3 will be sent
// if there's no last page then send what we have
// the main thing is that the system is not just sitting there waiting for a last page that never comes.

strings.enumerated().forEach { index, string in
    DispatchQueue.main.asyncAfter(deadline: .now() + .seconds(index)) {
        print("sending \(string)")
        subj.send(string)
    }
}

lastOrTimeout emits a value either when it sees LastPage or when it finishes because of a timeout (in which case the appended Done is emitted).

The main pipeline passes values through until the watchdog publisher emits a value; collect() then delivers everything gathered so far as a single array.
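
The append(Just("Done")) matters because, as the comments under this answer work out, prefix(untilOutputFrom:) reacts to an element from the other publisher and not to its completion. A minimal sketch that isolates this behavior follows; the names source, silentStop and loudStop are made up for illustration and are not part of the answer's code.

```swift
import Combine

let source = PassthroughSubject<Int, Never>()

// Case 1: the other publisher only completes and never emits an element.
let silentStop = PassthroughSubject<Void, Never>()
var passedSilent: [Int] = []
var silentFinished = false
let silentCancellable = source
    .prefix(untilOutputFrom: silentStop)
    .sink(receiveCompletion: { _ in silentFinished = true },
          receiveValue: { passedSilent.append($0) })

// Case 2: the other publisher emits one element.
let loudStop = PassthroughSubject<Void, Never>()
var passedLoud: [Int] = []
var loudFinished = false
let loudCancellable = source
    .prefix(untilOutputFrom: loudStop)
    .sink(receiveCompletion: { _ in loudFinished = true },
          receiveValue: { passedLoud.append($0) })

source.send(1)
silentStop.send(completion: .finished) // completion only, like timeout firing on its own
loudStop.send(())                      // an actual element, like the appended "Done"
source.send(2)

print("completion only:", passedSilent, "finished:", silentFinished)
print("with an element:", passedLoud, "finished:", loudFinished)
```

Comparing the two printed lines shows which of the pipelines stopped forwarding after the first value, which is the difference the appended element is there to make.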

Scott Thompson
  • I didn't know you can chain timeout and first like that. – Yogurt Oct 18 '22 at 22:18
  • I noticed that the timeout scenario doesn't work without the append(Just("Done")), why is that? – Yogurt Oct 18 '22 at 22:20
  • If you remove `"LastPage"` from the array and keep sending other elements indefinitely, your solution will never complete because `timeout` resets at each emission. – Fabio Felici Oct 18 '22 at 22:31
  • @FabioFelici I was talking about the append not the first. – Yogurt Oct 18 '22 at 22:41
  • After messing around with this in playgrounds I see the necessity for the append. You need to send the "Done" in order to trigger the condition to stop prefixing. – Yogurt Oct 18 '22 at 22:42
  • @Biclops Sorry, I was referring to the whole solution. `append` is needed because `timeout` only completes and doesn't send output, so it won't work for `prefix(untilOutputFrom)`. – Fabio Felici Oct 18 '22 at 22:43
  • @FabioFelici Your comments are valid, but it's not clear what the timeout requirements are. I suspect the solution might be to replace the `collect` in the main pipeline with something like the `.collect(.byTime(queue, .seconds(10)))` from the original post, and remove the `timeout` and `append` – Scott Thompson Oct 18 '22 at 23:00
  • @ScottThompson I tested your answer and it seems to time out a long time after it said it would. – Yogurt Oct 18 '22 at 23:33
  • @ScottThompson I also noticed an odd thing when testing the `lastPage` scenario. If I removed the timeout and only had the first and append, it doesn't always include the last page. I originally ran into this problem in a question I posted the other day, where I would collect up to the last page but not include it. In my tests I only see the last page getting added about half the time. – Yogurt Oct 18 '22 at 23:42
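
Two issues raised in the comments above are that `timeout` restarts its clock on every element and that the last page is not always part of the collected result. One possible way around both, offered only as a sketch: start a single fixed deadline at subscription time, and turn the last page into the value itself followed by a stop marker in the same stream, so the marker cannot overtake the value. `collectThrough(last:deadline:scheduler:)` is a hypothetical helper name, not a Combine API, and using `nil` as the stop marker is an assumption of this sketch.

```swift
import Combine
import Foundation

extension Publisher {
    /// Hypothetical helper: collects elements up to and including the first one
    /// matching `isLast`, or until a fixed deadline measured from subscription.
    func collectThrough<S: Scheduler>(
        last isLast: @escaping (Output) -> Bool,
        deadline interval: S.SchedulerTimeType.Stride,
        scheduler: S
    ) -> AnyPublisher<[Output], Failure> {
        // Wrap each element in an Optional; the last page is followed by nil (the stop marker).
        let values = self.flatMap { value -> Publishers.Sequence<[Output?], Failure> in
            let events: [Output?] = isLast(value) ? [value, nil] : [value]
            return Publishers.Sequence(sequence: events)
        }
        // One stop marker, delivered once after the deadline; it is not reset by new elements.
        let timer = Just<Output?>(nil)
            .delay(for: interval, scheduler: scheduler)
            .setFailureType(to: Failure.self)
        return values
            .merge(with: timer)
            .prefix { $0 != nil }   // finish at the first stop marker
            .compactMap { $0 }      // unwrap back to Output
            .collect()              // emit a single array on completion
            .eraseToAnyPublisher()
    }
}

let pages = PassthroughSubject<String, Never>()
var collected: [String] = []

let pagesCancellable = pages
    .collectThrough(last: { $0 == "LastPage" },
                    deadline: .seconds(10),
                    scheduler: DispatchQueue.main)
    .sink { collected = $0 }

["!@#$", "ZXCV", "LastPage", "ASDF", "JKL:"].forEach { pages.send($0) }
print(collected)
```

Because `collect()` emits once at completion rather than producing a new array for every element the way `scan { $0 + [$1] }` does, this may also help with the space concern from the question. The predicate receives the whole element, so it can inspect an id or flag on an item instead of comparing against a sentinel string.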