
I have an asynchronous task that is repeated a few times to pull up various image URLs for some animals. The code goes something like this:

let animals = ["Cheetah", "Lion", "Elephant"]
var image_urls: [String: [String]] = [:]

for animal in animals {
   var page = 0
   var urls: [String] = []
   while true {
       let res = try await fetchImageUrls(animal, page)
       // nextPage is optional; nil means there are no more pages to request
       guard let nextPage = res.nextPage else {
          break
       }
       urls.append(contentsOf: res.imagesUrls)
       page = nextPage
   }
   image_urls[animal] = urls
}

The problem is that each task has a serial element (each page can only be requested once the previous call has returned its nextPage) and a parallel element (the animals can be loaded in parallel).

When I do concurrentPerform

// .. same setup ..

DispatchQueue.concurrentPerform(iterations: animals.count) { (i) in
     let animal = animals[i]

     // same code as above:
     var page = 0
     var urls: [String] = []
     while true {
         // does not compile: this closure is synchronous, so it cannot await
         let res = try await fetchImageUrls(animal, page)
         // ...
     }
}

it fails on the await, because the closure passed to concurrentPerform has to be synchronous. I'm not sure how to fix this cleanly: adding completion handlers to make the code synchronous is very messy. I don't know why fetchImageUrls needs to be async, and I don't know enough to change that.

I feel lost here. Any suggestions on how to approach this?

JoeVictor
  • Completion handlers are probably your best bet. What about them makes the code messy? – JCode May 31 '22 at 16:40
  • Unrelated to the question at hand, but this exits if `nextPage` is `nil` without adding the last fetched result. Does `nextPage` being `nil` really mean that the current result had no URLs, or that there was a result but that there is no “next page” (i.e. you don’t need to attempt to fetch the next set of URLs)? Also, I’d probably make an `AsyncSequence` for `fetchImageUrls`, but we don’t have enough here to be specific about that implementation. – Rob Jun 08 '22 at 05:00

1 Answer


As you noted, concurrentPerform is designed to perform synchronous tasks in parallel. In Swift concurrency, rather than introducing concurrentPerform, if one wanted to perform a series of asynchronous tasks in parallel, one would reach for a task group. See Swift Programming Language: Concurrency: Tasks and Task Groups.

E.g., you might do something like:

func animalURLs() async throws -> [String: [String]] {
    try await withThrowingTaskGroup(of: (String, [String]).self) { group in
        let animals = ["Cheetah", "Lion", "Elephant"]
   
        for animal in animals {
            group.addTask {
                var page = 0
                var urls: [String] = []
                while true {
                    let res = try await self.fetchImageUrls(animal, page)
                    guard let nextPage = res.nextPage else {
                        return (animal, urls)
                    }
                    urls.append(contentsOf: res.imagesUrls)
                    page = nextPage
                }
            }
        }

        return try await group.reduce(into: [:]) { $0[$1.0] = $1.1 }

        // that `try await group.reduce` is equivalent to:
        //
        // var imageURLs: [String:[String]] = [:]
        //
        // for try await result in group {
        //     imageURLs[result.0] = result.1
        // }
        //
        // return imageURLs
    }
}
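The snippet above depends on `fetchImageUrls`, which the question does not show. The following is a minimal self-contained sketch of the same task group pattern, assuming a hypothetical `Page` response type and a stubbed `fetchImageUrls` that returns two pages per animal (both are placeholders, not the OP's real API). It also assumes that a `nil` `nextPage` means "this was the last page" rather than "this page is empty", so it appends each page's URLs before checking `nextPage`; if the real API behaves differently, keep the original ordering.

```swift
import Foundation

// Hypothetical response type; the question does not show the real one.
struct Page {
    let imagesUrls: [String]
    let nextPage: Int?
}

// Hypothetical stub standing in for the real network call.
// It serves two pages (0 and 1) per animal, one URL per page.
func fetchImageUrls(_ animal: String, _ page: Int) async throws -> Page {
    try await Task.sleep(nanoseconds: 10_000_000)  // simulate a little latency
    let urls = ["https://example.com/\(animal)/\(page).jpg"]
    return Page(imagesUrls: urls, nextPage: page < 1 ? page + 1 : nil)
}

// The serial part: pages for one animal are fetched one after another.
func allImageUrls(for animal: String) async throws -> [String] {
    var page = 0
    var urls: [String] = []
    while true {
        let res = try await fetchImageUrls(animal, page)
        urls.append(contentsOf: res.imagesUrls)  // keep this page before checking for more
        guard let nextPage = res.nextPage else { return urls }
        page = nextPage
    }
}

// The concurrent part: one child task per animal.
func animalURLs(_ animals: [String]) async throws -> [String: [String]] {
    try await withThrowingTaskGroup(of: (String, [String]).self) { group in
        for animal in animals {
            group.addTask {
                let urls = try await allImageUrls(for: animal)
                return (animal, urls)
            }
        }

        var result: [String: [String]] = [:]
        for try await (animal, urls) in group {
            result[animal] = urls
        }
        return result
    }
}
```

Pulling the pagination loop into its own function keeps the serial logic separate from the task group, so either part can be changed without touching the other.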

In the comments below, ngb objected: "unfortunately this [withThrowingTaskGroup] doesn't perform the tasks in parallel like concurrentPerform, merely concurrently. so u don't speed up processing by utilising multiple cpu's."

While Swift concurrency is, fundamentally, a concurrency system (with await suspension points, continuations, etc.), task groups do offer parallelism.

Consider a very different example (with the OSLog cruft removed):

func testAsyncAwait() async {
    await withTaskGroup(of: Void.self) { group in
        for i in 0 ..< iterations {
            group.addTask { [self] in
                // some random, synchronous, computationally intensive calculation
                _ = calculatePi(iteration: i, decimalPlaces: digits)
            }
        }
        await group.waitForAll()
    }
}

In Instruments, we can see the parallelism in action:

[Image: Instruments trace of the task group's tasks running in parallel]

And benchmarking this running on 20 CPUs against a serial rendition, this ran 16× faster. (It is not quite 20× as fast because of some modest overhead that parallelism entails.)

In short, we can enjoy parallelism with Swift concurrency, with the following caveats:

  1. One needs to have enough work on each thread to justify the overhead that parallelism entails. (This is why I chose a computationally intensive calculation for the above demonstration.)

  2. Please note that the Xcode simulator artificially constrains the cooperative thread pool, so one should always run parallelism tests on an actual device with a release build of the app.

Now, the OP’s code snippet is almost certainly not CPU bound, so this “does Swift concurrency offer parallelism” question is somewhat academic. But the OP’s example will enjoy great performance benefits from running concurrently, nonetheless.
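To illustrate that last point, here is a rough sketch that compares awaiting the fetches one at a time with running them in a task group. It assumes a hypothetical `simulatedFetch` that merely sleeps for 0.2 seconds in place of a real network request; `fetchSequentially`, `fetchConcurrently`, and `compareTimings` are likewise illustrative names, not part of any API.

```swift
import Foundation

// Hypothetical stand-in for one animal's network work: it only waits 0.2 s.
func simulatedFetch(_ animal: String) async throws -> String {
    try await Task.sleep(nanoseconds: 200_000_000)
    return animal
}

// Awaits each fetch before starting the next one.
func fetchSequentially(_ animals: [String]) async throws -> [String] {
    var results: [String] = []
    for animal in animals {
        let result = try await simulatedFetch(animal)
        results.append(result)
    }
    return results
}

// Starts one child task per animal; results arrive in completion order.
func fetchConcurrently(_ animals: [String]) async throws -> [String] {
    try await withThrowingTaskGroup(of: String.self) { group in
        for animal in animals {
            group.addTask { try await simulatedFetch(animal) }
        }
        var results: [String] = []
        for try await result in group {
            results.append(result)
        }
        return results
    }
}

// Measures the wall-clock seconds taken by each approach.
func compareTimings(_ animals: [String]) async throws -> (serial: Double, concurrent: Double) {
    var start = Date()
    _ = try await fetchSequentially(animals)
    let serial = Date().timeIntervalSince(start)

    start = Date()
    _ = try await fetchConcurrently(animals)
    let concurrent = Date().timeIntervalSince(start)

    return (serial, concurrent)
}
```

With three animals and this stub, the sequential version has to wait for three sleeps in a row, whereas the task group's sleeps overlap, so one would expect the concurrent timing to be noticeably shorter even though none of the work is CPU bound.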

Rob
  • unfortunately this doesn't perform the tasks in parallel like concurrentPerform, merely concurrently. so u don't speed up processing by utilising multiple cpu's. – ngb Sep 23 '22 at 00:24
  • That’s simply not true. `withThrowingTaskGroup` definitely does run tasks in parallel (on separate cooperative pool worker threads). Now, what is true is that if you’re performing network requests, you’re not CPU bound, so it’s somewhat immaterial. – Rob Sep 23 '22 at 01:45
  • Went back and had a look at this, and it seems you are right. I was basing what I said on this explanation: https://forums.swift.org/t/taskgroup-and-parallelism/51039/5, which echoed my own experience, in that I'd often get all the tasks running on a single thread. However, on going back and looking at why, there were probably other factors causing that. It seems concurrentPerform provides some guarantees for parallel processing. You are right that for the OP's question it's irrelevant. Anyway, your answer helped me move some more of my code away from GCD and understand task groups better, so thanks. – ngb Sep 23 '22 at 03:27
  • @ngb I've expanded my answer above to reflect this (even though it is a little tangential). I have also clarified on that thread that you reference. – Rob Sep 23 '22 at 17:42