
My question is simple: how do I make the following code lazy?

/*
input: [
    [1, 2],
    [3, 4],
    [5, 6]
]

output: [
    [1, 3, 5],
    [1, 3, 6],
    [1, 4, 5],
    [1, 4, 6],
    [2, 3, 5],
    [2, 3, 6],
    [2, 4, 5],
    [2, 4, 6],
]
*/

func combinations<T>(options: [[T]]) -> [[T]] {
    guard let head = options.first else {
        return [] // no options, so no combinations
    }

    if options.count == 1 {
        return head.map({ [$0] })
    }

    let tailCombinations = combinations(options: Array(options.dropFirst()))

    return head.flatMap({ option in
        return tailCombinations.map({ combination -> [T] in
            return [option] + combination
        })
    })
}

The code above computes the combinations correctly, but it builds the entire array of arrays in memory. What I need is for it to return something like LazySequence<Array<T>>, but the Swift type system doesn't let me express something that generic.

Any ideas how to achieve this and keep the functional style?

P.S.: I did think of another way to solve this problem with generators and index tracking, but I don't want to keep track of any state; I want a purely functional (as in FP) solution. Haskell does this by default, by the way, and I'm looking for the same thing.

EDIT: I've managed to solve part of the problem (the type system) with AnyCollection:

func combinations<T>(options: [[T]]) -> LazyCollection<AnyCollection<[T]>> {
    guard let head = options.first else {
        return AnyCollection([].lazy.map({ [$0] })).lazy
    }

    if options.count == 1 {
        return AnyCollection(head.lazy.map({ [$0] })).lazy
    }

    let tailCombinations = combinations(options: Array(options.dropFirst()))

    return AnyCollection(head.lazy.flatMap({ option in
        return tailCombinations.lazy.map({ [option] + $0 })
    })).lazy
}

But when I use the function, it loads the entire collection into memory, i.e., it is not lazy.

EDIT 2: After some more investigation, it turns out the problem is with AnyCollection:

// stays lazy
let x1 = head.lazy.flatMap({ option in
    return tailCombinations.lazy.map({ [option] + $0 })
})

// forces to load in memory
let x2 = AnyCollection(head.lazy.flatMap({ option in
    return tailCombinations.lazy.map({ [option] + $0 })
}))

Not sure how to solve this yet.
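
One way to check whether a given pipeline stays lazy is to count how often the innermost closure runs. The following is only a sketch: the counter `calls` and the sample values are made up for illustration, and the snippet covers only the unwrapped `lazy.flatMap` variant.

```swift
// Hypothetical sample data standing in for `head` and `tailCombinations`.
let head = [1, 2]
let tailCombinations = [[3, 5], [3, 6], [4, 5], [4, 6]]

// Counts how many combinations have actually been built.
var calls = 0

let x1 = head.lazy.flatMap { option in
    tailCombinations.lazy.map { combination -> [Int] in
        calls += 1
        return [option] + combination
    }
}

// Building the lazy pipeline should not run the inner closure at all.
let callsBeforeIteration = calls
print("calls after construction:", callsBeforeIteration)

// Pulling one element should only build what is needed for that element.
var iterator = x1.makeIterator()
let first = iterator.next()
print("first element:", first ?? [])
print("calls after one element:", calls)
```

Wrapping `x1` in a type-erased wrapper and repeating the measurement shows whether that wrapper forces evaluation.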

Rodrigo Ruiz

2 Answers


Here is what I came up with:

func combinations<T>(options: [[T]]) -> AnySequence<[T]> {
    guard let lastOption = options.last else {
        return AnySequence(CollectionOfOne([]))
    }
    let headCombinations = combinations(options: Array(options.dropLast()))
    return AnySequence(headCombinations.lazy.flatMap { head in
        lastOption.lazy.map { head + [$0] }
    })
}

The main difference from your solution is that the recursive call creates a sequence of the combinations of the first N-1 options, and each element of that sequence is then combined with each element of the last option. This is more efficient because the sequence returned from the recursive call is enumerated only once, and not once for each element that it is combined with.

Other differences are:

  • There is no need to call .lazy on the AnySequence if that sequence is already lazy. The return type is therefore "simplified" to AnySequence<[T]>.
  • I have used CollectionOfOne to create a single-element sequence for the empty array.
  • Treating the case options.count == 1 separately is not necessary for the algorithm to work (but might be a possible performance improvement).
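
As a rough check that the sequence-based function really is lazy, the sketch below pulls only the first three elements out of a very large result. The function is repeated from above so that the snippet is self-contained; the input `bigOptions` and the other variable names are assumptions made for the example.

```swift
// Repeated from the answer above so this snippet compiles on its own.
func combinations<T>(options: [[T]]) -> AnySequence<[T]> {
    guard let lastOption = options.last else {
        return AnySequence(CollectionOfOne([]))
    }
    let headCombinations = combinations(options: Array(options.dropLast()))
    return AnySequence(headCombinations.lazy.flatMap { head in
        lastOption.lazy.map { head + [$0] }
    })
}

// Hypothetical input: 1000 * 1000 * 1000 = one billion combinations,
// far too many to materialize eagerly in a quick test.
let bigOptions = Array(repeating: Array(0..<1000), count: 3)
let sequence = combinations(options: bigOptions)

// Pull just three elements through the iterator.
var iterator = sequence.makeIterator()
var firstThree: [[Int]] = []
while firstThree.count < 3, let next = iterator.next() {
    firstThree.append(next)
}
print(firstThree) // [[0, 0, 0], [0, 0, 1], [0, 0, 2]]
```

The last option varies fastest because each head combination is paired with every element of `lastOption` before the next head is requested.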

A completely different approach is to define a custom collection type which computes each combination as a function of the index, using simple modulo arithmetic:

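struct Combinations<T> : RandomAccessCollection {
    // Stored in reverse order, so that the last option varies fastest.
    let options: [[T]]
    let startIndex = 0
    // Total number of combinations: the product of all option counts.
    let endIndex: Int

    init(options: [[T]]) {
        self.options = options.reversed()
        self.endIndex = options.reduce(1) { $0 * $1.count }
    }

    subscript(index: Int) -> [T] {
        // Treat `index` as a mixed-radix number with one digit per option.
        var i = index
        var combination: [T] = []
        combination.reserveCapacity(options.count)
        options.forEach { option in
            combination.append(option[i % option.count])
            i /= option.count
        }
        // Digits were collected last option first, so restore the original order.
        return combination.reversed()
    }
}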

No extra storage is needed and no recursion. Example usage:

let all = Combinations(options: [[1, 2], [3, 4], [5, 6]])
print(all.count)
for c in all { print(c) }

Output:

8
[1, 3, 5]
[1, 3, 6]
[1, 4, 5]
[1, 4, 6]
[2, 3, 5]
[2, 3, 6]
[2, 4, 5]
[2, 4, 6]
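
To make the modulo arithmetic in the subscript more concrete, here is the same index decoding as a standalone function. This is only an illustrative sketch: the name `combination(at:in:)` is made up for the example and is not part of the collection type above.

```swift
// Hypothetical helper: decodes `index` as a mixed-radix number whose
// digits select one element from each option, last option first.
func combination<T>(at index: Int, in options: [[T]]) -> [T] {
    var i = index
    var result: [T] = []
    for option in options.reversed() {
        result.append(option[i % option.count]) // current digit
        i /= option.count                       // shift to the next digit
    }
    return Array(result.reversed())
}

let sampleOptions = [[1, 2], [3, 4], [5, 6]]

// Index 5: 5 % 2 = 1 picks 6, then 2 % 2 = 0 picks 3, then 1 % 2 = 1 picks 2.
print(combination(at: 5, in: sampleOptions)) // [2, 3, 6]
print(combination(at: 0, in: sampleOptions)) // [1, 3, 5]
print(combination(at: 7, in: sampleOptions)) // [2, 4, 6]
```

Because each element depends only on its index, any combination can be computed directly without enumerating the ones before it.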

Testing with

let options = Array(repeating: [1, 2, 3, 4, 5], count: 5)

this collection-based method turned out to be faster than my sequence-based method above by a factor of 2.

Martin R
  • I'm not sure I understand the performance gain, in my example, if I have 2 elements in the head and 10 in the tail, it will be 20 iterations. In your case, if you have 2 in the tail and 10 in the head, also 20 iterations. – Rodrigo Ruiz Jul 17 '17 at 05:46
  • @RodrigoRuiz: In your case, the tail sequence will be enumerated twice, once for each element in the head. I checked that by adding a print statement to the `{ [option] + $0 }` closure. – Martin R Jul 17 '17 at 05:49
  • True, but in your case the lastOption will be enumerated 10 times, that's why I don't see the difference. It's either 2 * 10 or 10 * 2 (if you know what I mean). – Rodrigo Ruiz Jul 17 '17 at 05:50
  • @RodrigoRuiz: But lastOption is an *array,* so I *think* it should be faster to enumerate. I'll do some benchmarks later. – In any case, the other simplifications might be of some use. – Martin R Jul 17 '17 at 05:52
  • Actually I think your algorithm won't work. Did you run it? From what I see, the first `headCombinations` that actually gets used will be an empty array, right? So your `flatMap` won't do any iterations. – Rodrigo Ruiz Jul 17 '17 at 06:01
  • @RodrigoRuiz: I tested it and it works. With `options = Array(repeating: [1, 2, 3, 4, 5], count: 5)` it is approx. 4x faster (tested on a MacBook in Release mode). – Martin R Jul 17 '17 at 06:03
  • The first headCombinations return a sequence with an empty array as the single element, this empty array is combined with all elements from lastOption – Martin R Jul 17 '17 at 06:04
  • Interesting, I'll try to, but still don't understand why it's faster, theoretically it should be the same number of iterations right? – Rodrigo Ruiz Jul 17 '17 at 06:07
  • @RodrigoRuiz: It is the same number of iterations. I *assume* that mapping an array is just faster than mapping a recursively defined, type-erased AnySequence. – Also note that repeatedly enumerating a sequence is *undefined behaviour,* see "Repeated Access" in https://developer.apple.com/documentation/swift/sequence, so your code works only by chance. – Martin R Jul 17 '17 at 06:51
  • I see, so I really do want `AnyCollection` instead of `AnySequence` – Rodrigo Ruiz Jul 17 '17 at 14:37
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/149395/discussion-between-martin-r-and-rodrigo-ruiz). – Martin R Jul 17 '17 at 14:46

I found one possible solution, but I'll leave this answer not accepted for a while to see if someone knows a better one.

func combinations<T>(options: [[T]]) -> LazySequence<AnySequence<[T]>> {
    guard let head = options.first else {
        return AnySequence([].lazy.map({ [$0] })).lazy
    }

    if options.count == 1 {
        return AnySequence(head.lazy.map({ [$0] })).lazy
    }

    let tailCombinations = combinations(options: Array(options.dropFirst()))

    return AnySequence(head.lazy.flatMap({ option in
        return tailCombinations.lazy.map({ [option] + $0 })
    })).lazy
}

The solution was to use AnySequence instead of AnyCollection. I'm not sure why, though. I'd still prefer the AnyCollection interface over AnySequence, since it provides a few more methods, such as count.
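
If count is the main thing missing, it can be computed directly from the input without enumerating the sequence. This is a minimal sketch, assuming a hypothetical helper named `combinationCount(options:)`; it returns 0 for empty input to match the function above, which yields no combinations in that case.

```swift
// Hypothetical helper: number of combinations the function above produces.
func combinationCount<T>(options: [[T]]) -> Int {
    // The function above returns an empty sequence when there are no options.
    guard !options.isEmpty else { return 0 }
    // Otherwise every option multiplies the number of combinations.
    return options.reduce(1) { $0 * $1.count }
}

print(combinationCount(options: [[1, 2], [3, 4], [5, 6]])) // 8
```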

Rodrigo Ruiz