
If you have an async method that produces a value, what is the most efficient way to iterate over that method to produce an array of values?

protocol ImageFetching {
    var title: String { get }
    func fetch(from url: URL) async throws -> UIImage
    func fetch(from urls: [URL]) async throws -> [UIImage]
}

This class produces an array by using a TaskGroup that iterates over the urls and calls its single-value method fetch(from:) -> UIImage.

class ImageService: ImageFetching {

    enum Error: Swift.Error {
        case badData
    }

    var title: String { String(describing: Self.self) }

    func fetch(from url: URL) async throws -> UIImage {
        let session = URLSession(configuration: .default)
        let data = try await session.data(from: url).0
        guard let image = UIImage(data: data) else { throw Error.badData }
        return image
    }

    func fetch(from urls: [URL]) async throws -> [UIImage] {
        return try await withThrowingTaskGroup(of: UIImage.self) { group in
            for url in urls {
                group.addTask {
                    try await self.fetch(from: url)
                }
            }

            var items: [UIImage] = []
            for try await item in group {
                items.append(item)
            }
            return items
        }
    }
}

This class does not call the single-value method. Instead, it invokes URLSession.data(from:) and UIImage(data:) manually. Oddly, this is less efficient than the ImageService implementation above.

class ImageServiceUnrolled: ImageService {

    override func fetch(from urls: [URL]) async throws -> [UIImage] {
        let session = URLSession(configuration: .default)
        return try await withThrowingTaskGroup(of: UIImage.self) { group in
            for url in urls {
                group.addTask {
                    let data = try await session.data(from: url).0
                    guard let image = UIImage(data: data) else { throw Error.badData }
                    return image
                }
            }

            var items: [UIImage] = []
            for try await item in group {
                items.append(item)
            }
            return items
        }
    }
}

This class uses an AsyncStream with the unfolding initializer. This is the least efficient approach.

class ImageServiceAsyncStreamUnfolding: ImageService {

    override func fetch(from urls: [URL]) async throws -> [UIImage] {
        var it = urls.makeIterator()
        return try await AsyncStream(unfolding: {
            it.next()
        })
        .map { url -> UIImage in
            try await self.fetch(from: url)
        }
        .reduce(into: [], { partialResult, image in
            partialResult.append(image)
        })
    }
}

This class uses an AsyncStream with the continuation initializer. It is slightly more efficient than the implementation above.

class ImageServiceAsyncStreamContinue: ImageService {

    override func fetch(from urls: [URL]) async throws -> [UIImage] {
        return try await AsyncStream.init { continuation in
            Task.detached {
                for url in urls {
                    continuation.yield(url)
                }
                continuation.finish()
            }
        }
        .map { url -> UIImage in
            try await self.fetch(from: url)
        }
        .reduce(into: [], { partialResult, image in
            partialResult.append(image)
        })
    }
}

I'm curious why the first implementation is the most efficient, and especially why it is more efficient than the second implementation. The AsyncStream implementations seem to be very inefficient by comparison.

Is using TaskGroup the best way to tackle this?

Here are the times (in milliseconds) that I'm observing when downloading 10 images from the web on each iteration.

Elapsed [ImageService]: 12.93
Elapsed [ImageService]: 13.00
Elapsed [ImageService]: 16.74
Elapsed [ImageServiceUnrolled]: 16.41
Elapsed [ImageServiceUnrolled]: 19.06
Elapsed [ImageServiceUnrolled]: 17.88
Elapsed [ImageServiceAsyncStreamUnfolding]: 28.76
Elapsed [ImageServiceAsyncStreamUnfolding]: 55.86
Elapsed [ImageServiceAsyncStreamUnfolding]: 28.47
Elapsed [ImageServiceAsyncStreamContinue]: 29.55
Elapsed [ImageServiceAsyncStreamContinue]: 28.53
Elapsed [ImageServiceAsyncStreamContinue]: 27.05

Rob C
    (1) What does "efficient" mean? (2) Aren't you in danger of confusing your code with the network? – matt Jan 08 '22 at 14:04

1 Answer


Unfortunately, I can't benchmark the result because you didn't provide a sample project. But you can try this extension:

extension Sequence {

    func concurrentMap<T>(
        _ transform: @escaping (Element) async throws -> T
    ) async throws -> [T] {
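        // Start one unstructured task per element so the transforms run concurrently.
        let tasks = map { element in
            Task { try await transform(element) }
        }

        // Await the tasks in input order, so the results keep the order of the sequence.
        var results: [T] = []
        for task in tasks {
            results.append(try await task.value)
        }
        return results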
    }
}

And use it like:

func fetch(from urls: [URL]) async throws -> [UIImage] {
    try await urls.concurrentMap(fetch)
}
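If you would rather stay with a task group, as in the question, keep in mind that its `for try await` loop receives results in completion order. A minimal sketch of one way to restore input order is to tag each result with its position. The name `orderedConcurrentMap` is hypothetical, and the `Sendable` constraints are an assumption made to suit stricter concurrency checking:

```swift
extension Collection where Element: Sendable {

    /// Hypothetical helper: runs `transform` concurrently in a task group
    /// and returns the results in the same order as the input.
    func orderedConcurrentMap<T: Sendable>(
        _ transform: @escaping @Sendable (Element) async throws -> T
    ) async throws -> [T] {
        try await withThrowingTaskGroup(of: (Int, T).self) { group in
            // Tag every child task with the position of its input element.
            for (offset, element) in self.enumerated() {
                group.addTask { (offset, try await transform(element)) }
            }

            // Results arrive in completion order; place each one by its offset.
            var results = [T?](repeating: nil, count: self.count)
            for try await (offset, value) in group {
                results[offset] = value
            }
            return results.compactMap { $0 }
        }
    }
}
```

Inside the service this could be called as `try await urls.orderedConcurrentMap { try await self.fetch(from: $0) }`, assuming the service type can be captured safely in a `@Sendable` closure.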

The only notes I have here are:

  • Remember that concurrent tasks may finish in a different order than they were started
  • You may not need to download all images up front; consider a lazy-loading approach instead
  • Network-based tasks are not an isolated thing to benchmark. It would be best to test in an isolated environment where other variables, such as speed and latency, do not affect the result
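To illustrate the last note, here is a minimal sketch of timing the iteration strategy without touching the network. Everything in it is an assumption for illustration: `StubFetcher` is a hypothetical stand-in that sleeps for a fixed delay and returns a string instead of a `UIImage`, and `measureMilliseconds` is a hypothetical helper based on `Date`, not a proper benchmarking tool.

```swift
import Foundation

/// Hypothetical stand-in for the network: waits a fixed time, returns fake data.
struct StubFetcher: Sendable {
    let delayNanoseconds: UInt64

    func fetch(id: Int) async throws -> String {
        try await Task.sleep(nanoseconds: delayNanoseconds)
        return "image-\(id)"
    }
}

/// Awaits each fetch before starting the next one.
func fetchSequentially(_ ids: [Int], using fetcher: StubFetcher) async throws -> [String] {
    var items: [String] = []
    for id in ids {
        items.append(try await fetcher.fetch(id: id))
    }
    return items
}

/// Starts all fetches as child tasks, then collects them as they complete.
func fetchConcurrently(_ ids: [Int], using fetcher: StubFetcher) async throws -> [String] {
    try await withThrowingTaskGroup(of: String.self) { group in
        for id in ids {
            group.addTask { try await fetcher.fetch(id: id) }
        }
        var items: [String] = []
        for try await item in group {
            items.append(item)
        }
        return items
    }
}

/// Runs `work` once and returns its result plus the elapsed wall-clock time in milliseconds.
func measureMilliseconds<T>(
    _ work: () async throws -> T
) async rethrows -> (result: T, milliseconds: Double) {
    let start = Date()
    let result = try await work()
    return (result, Date().timeIntervalSince(start) * 1000)
}
```

With a fixed delay, the sequential version cannot finish faster than the sum of all delays, while the task group version should take roughly one delay plus scheduling overhead. Any difference that remains between two concurrent implementations is then down to the code rather than the network.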

Some studies suggest that manual dispatching can be more efficient than Swift's current implementation of async/await. But Swift is being improved every day, and this may change under the hood someday. Choose your approach wisely.

Mojtaba Hosseini