4

Given an array, like:

[0, 0.5, 0.51, 1.0, 1.5, 1.99, 2.0, 2.1, 2.5, 3.0] 

I want to cluster values into subarrays based on the difference between consecutive elements (i.e., adjacent values x and y stay in the same subarray where abs(x-y) < n, with n = 0.2), e.g.:

[[0], [0.5, 0.51], [1.0], [1.5], [1.99, 2.0, 2.1], [2.5], [3.0]]. 

I'd like to do it declaratively — just to get a better grasp on how more complex sequence operations might work in a functional context (it seems like most "functional Swift" demos/tutorials are pretty basic).

Thanks in advance.


Update:

Here's a one-liner that's kinda close:

let times = [0, 0.5, 0.99, 1, 1.01, 1.5, 2, 2.5, 2.51, 3, 3.49, 3.5]

let result = times.map { t1 in
    return times.filter { fabs($0 - t1) < 0.2 }
}

// [[0.0], [0.5], [0.99, 1.0, 1.01], [0.99, 1.0, 1.01], [0.99, 1.0, 1.01], [1.5], [2.0], [2.5, 2.51], [2.5, 2.51], [3.0], [3.49, 3.5], [3.49, 3.5]]

Just need to get rid of duplicates.

jbm
  • Sure that's not coming from an extension to Array provided by @Aaron or myself sitting in your environment? There is a split function on CollectionType taking a closure but the signature is (Self.Generator.Element) throws -> Bool. Your 2nd Update code segfaults 7.0.1 for me and shows Cannot convert value of type '(Double, Double) -> Bool' to expected argument type 'Double' in 7.1 beta 3. – Julian Oct 11 '15 at 10:00
  • oh crap, you're right. I'd forgotten you guys used "split" for your extensions (I renamed yours to split2)—as you suggest, I was intending to use the standard lib one. I'll get rid of it now. Makes my comment even more ironic; it really was "under my nose" (under my cursor, actually). I can't down vote it, unfortunately. Would you mind knocking it down one? – jbm Oct 11 '15 at 14:03

5 Answers

4

A simple fold with an accumulating parameter works. By the way, I'm not sure this is exactly what you want, as I don't understand whether the clustered elements need to be consecutive in your array. Your description says so, but your 'sample answer' doesn't seem to take that into account. You should improve the question description.

let a : [Double] = [0, 0.5, 0.51, 1.0, 1.5, 1.99, 2.0, 2.1, 2.5, 3.0];
let diff : Double = 0.2;
let eps = 0.0000001

let b = a.sort().reduce(([],[])) { (ps : ([Double],[[Double]]), c : Double) -> ([Double],[[Double]]) in
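  // ps.0 is the group being built; ps.1 holds the finished groups.
  // Compare against the last element of the current group so that the differences are sequential.
  if ps.0.count == 0 || abs(ps.0.last! - c) - diff <= eps {
    return (ps.0 + [c], ps.1)
  } else {
    return ([c], ps.1 + [ps.0])
  }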
}
let result = b.1 + [b.0];
print(result)

Returns

[[0.0], [0.5, 0.51], [1.0], [1.5], [1.99, 2.0, 2.1], [2.5], [3.0]]
rafalio
  • Ah! Yes, this does look good. And yes, they would be sequential, which I think is pretty clear from the question... (Unless I've just gone blind—which is possible—the sample answer is sequential.) But thanks for the answer. I'll look it over in detail and try to wrap my head around what you're doing. Cheers. – jbm Oct 11 '15 at 04:04
  • Okay... While your version is **very** fast, compared to the others, it gets the wrong output with Rob's second test/input array: `[0.9, 1, 1.1, 2, 3]`. Your version outputs: `[[0.9, 1.0], [1.1], [2.0], [3.0]]`, as opposed to `[[0.9, 1.0, 1.1], [2.0], [3.0]]`. – jbm Oct 11 '15 at 04:10
  • @mrwheet good catch! ugh, I think that's floating point arithmetic for you... 0.1 is not representable in binary. (Try evaluating `(1.1 - 0.9) <= 0.2` in Swift; it evaluates to false.) I've updated the answer with an epsilon check. – rafalio Oct 11 '15 at 06:49
  • Wow, okay... good to know! I'm going to have to keep that in mind in future. – jbm Oct 11 '15 at 06:54
  • Okay, I've looked this over in more detail and decided to switch it to the "correct" answer. Not necessarily because it's concretely "better" than Aaron's, but simply because it's more thoroughly declarative (i.e., figuring it out taught me more about functional thinking!). I should revise, however, my above comment about speed—not sure why it appeared so much faster than Aaron's last night, since it seems about the same now. I'm assuming this is down to some Xcode Playground weirdness. Thanks again, everybody. – jbm Oct 11 '15 at 16:18
  • ugh... okay, I promise I'll shut up in a second. Just a "shout out" that Aaron and Julian's solutions do have the benefit of parameterizing the condition, which is a bonus for generalization. – jbm Oct 11 '15 at 16:24
  • Hey! @mrwheet Awesome, glad you liked it. Whenever I'm trying to do something 'functional' in swift, I try to think how I would do it in Haskell and then translate that. I very highly recommend you learn a bit of Haskell (read the free book at http://learnyouahaskell.com/) and it will massively improve your functional thinking and be applicable to swift too. Also, just recently the updated version of 'Functional programming in Swift' came out. I read the first version and can also highly recommend it. – rafalio Oct 11 '15 at 18:38
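
The floating-point point raised in the comments above can be checked directly. This is only a minimal sketch in current Swift syntax: the helper `isWithin(_:of:distance:tolerance:)` and its default tolerance are hypothetical names chosen for illustration, mirroring the `eps` check in the answer rather than reproducing it.

```swift
// Neither 0.9, 1.1 nor 0.2 is exactly representable in binary floating point,
// so the computed difference ends up slightly above the stored value of 0.2.
let low: Double = 0.9
let high: Double = 1.1
let threshold: Double = 0.2

let strictlyWithin = (high - low) <= threshold   // false

// Hypothetical helper: accept differences that exceed `distance`
// by no more than a small tolerance.
func isWithin(_ x: Double, of y: Double, distance: Double, tolerance: Double = 1e-7) -> Bool {
    return abs(x - y) - distance <= tolerance
}

let tolerantlyWithin = isWithin(low, of: high, distance: threshold)   // true
print(strictlyWithin, tolerantlyWithin)
```

Note that a tolerance like this makes the comparison inclusive (`<=`), whereas the question asks for a strict `<`; which one is right depends on how borderline values should be treated.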
3

I don't know of any native Swift methods that do what you want. You can accomplish this with a simple extension though:

extension Array {
    func split(condition : (Element, Element) -> Bool) -> [[Element]] {
        var returnArray = [[Element]]()

        var currentSubArray = [Element]()

        for (index, element) in self.enumerate() {
            currentSubArray.append(element)

            if index == self.count - 1 || condition(element, self[index+1]) {
                returnArray.append(currentSubArray)
                currentSubArray = []
            }
        }

        return returnArray
    }
}

Example usage:

let source = [0, 0.5, 0.51, 1.0, 1.5, 1.99, 2.0, 2.1, 2.5, 3.0]
let n = 0.2
let target = source.split { abs($0 - $1) > n }

Output:

[[0.0], [0.5, 0.51], [1.0], [1.5], [1.99, 2.0, 2.1], [2.5], [3.0]]
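
The extension above uses Swift 2 spellings such as `enumerate()`; from Swift 3 onward that method is `enumerated()`. Below is a rough sketch of the same loop in current syntax. The method name `chunks(splittingWhere:)` is hypothetical, picked only to avoid clashing with the standard library's `split`.

```swift
extension Array {
    // Hypothetical name. Starts a new subarray whenever
    // `shouldSplit(current, next)` returns true.
    func chunks(splittingWhere shouldSplit: (Element, Element) -> Bool) -> [[Element]] {
        var result = [[Element]]()
        var current = [Element]()

        for (index, element) in self.enumerated() {
            current.append(element)
            // Close the current chunk at the last element or at a split point.
            if index == count - 1 || shouldSplit(element, self[index + 1]) {
                result.append(current)
                current = []
            }
        }
        return result
    }
}

let samples: [Double] = [0, 0.5, 0.51, 1.0, 1.5, 1.99, 2.0, 2.1, 2.5, 3.0]
let clusters = samples.chunks { abs($0 - $1) > 0.2 }
print(clusters)
```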
Aaron Brager
  • Yes, that's quite nice. Thanks! I was trying to stick (slavishly) to the functional motto of avoiding loops, but I don't really know how I'd do that with this problem... I'll see if anyone else responds, just out of curiosity. I do a lot of symbolic music processing, so I do this kind of thing all the time, imperatively. Just exploring, really. Thanks again. – jbm Oct 11 '15 at 01:53
  • Thanks for the interesting question. I tried to stitch together a solution that used map/filter/reduce/split, but couldn't make it work simply. – Aaron Brager Oct 11 '15 at 02:19
3

This does it with reduce:

extension Array {
    func split(condition : (Element, Element) -> Bool) -> [[Element]] {
        return self.reduce([], combine:
            { (list : [[Element]], value : Element) in
                if list.isEmpty {
                    return [[value]]
                }
                else if !condition(list.last!.last!, value) {
                    return list[0..<list.count - 1] + [list.last!+[value]]
                }
                else {
                    return list + [[value]]
                }
            }
        )
    }
}

let source = [0, 0.5, 0.51, 1.0, 1.5, 1.99, 2.0, 2.1, 2.5, 3.0]
let n = 0.2
let target = source.split { abs($0 - $1) > n }

Output:

[[0], [0.5, 0.51], [1], [1.5], [1.99, 2, 2.1], [2.5], [3]]

Update

If you don't mind mutating the arrays in the reduce you get a shorter and presumably more efficient solution:

extension Array {
    func split(condition : (Element, Element) -> Bool) -> [[Element]] {
        return self.reduce([], combine:
            { ( var list : [[Element]], value : Element) in
                if list.isEmpty || condition(list.last!.last!, value) {
                    list += [[value]]
                }
                else {
                    list[list.count - 1] += [value]
                }
                return list
            }
        )
    }
}
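
For readers on newer toolchains: `var` closure parameters were removed in Swift 3, and Swift 4 added `reduce(into:_:)`, which hands the accumulator to the closure as `inout`. A sketch of the mutating variant in that style follows. The method name `grouped(splittingWhere:)` is hypothetical, and the predicate keeps the convention used in this answer (`true` means "start a new group").

```swift
extension Array {
    // Hypothetical name; `shouldSplit(previous, value)` returning true opens a new group.
    func grouped(splittingWhere shouldSplit: (Element, Element) -> Bool) -> [[Element]] {
        return reduce(into: [[Element]]()) { groups, value in
            if let previous = groups.last?.last, !shouldSplit(previous, value) {
                // Still within the current group: extend it in place.
                groups[groups.count - 1].append(value)
            } else {
                // First element, or a split point: open a new group.
                groups.append([value])
            }
        }
    }
}

let noteTimes: [Double] = [0.9, 1, 1.1, 2, 3]
let chords = noteTimes.grouped { abs($0 - $1) > 0.2 }
print(chords)
```

Because the accumulator is mutated in place, this should avoid most of the repeated array copying that the pure version incurs, though I have not benchmarked it.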
Julian
  • Yeah, that does it, too. Trickier to follow, mind you. – jbm Oct 11 '15 at 03:01
  • I'll be interested to see what others come up with. It does satisfy the requirements of your question though. It would also be interesting to come up with the same solution in Clojure or some other more strongly functional language. – Julian Oct 11 '15 at 03:15
  • Interestingly, mine and Julian's have roughly the same performance, while Aaron's is about twice as fast. Of course, mine's wrong (!! at this point anyway...) so that about does it for me! – jbm Oct 11 '15 at 03:16
  • The outer and inner arrays are constantly recreated in order to make it pure functional. I think you could cheat and not do that while still using reduce, and the performance would improve. – Julian Oct 11 '15 at 03:20
  • Any time you use reduce to generate an array, you're probably going to wind up O(n^2) in Swift, because it has to keep copying the array every time you call `.append()`. I'm certain my solution has just as terrible performance. Swift is not a functional language and lacks list-building primitives you need to do this efficiently. – Rob Napier Oct 11 '15 at 03:22
  • Yes, I was thinking the same thing. I guess this is just the kind of case that makes the combination of imperative and declarative a bonus of using Swift. – jbm Oct 11 '15 at 03:57
1

Swift isn't that bad at functional programming.

Here's one way to do it that avoids if/else statements and keeps the grouping condition isolated (more legible and straightforward than the accepted answer, IMHO):

let values:[Double] = [0, 0.5, 0.51, 1.0, 1.5, 1.99, 2.0, 2.1, 2.5, 3.0]

let inGroup     = { (x:Double,y:Double) in return abs(x-y) < 0.2 }

let intervals   = zip(values,values.enumerated().dropFirst())            
let starts      = intervals.filter{ !inGroup($0,$1.1) }.map{$0.1.0}
let ranges      = zip([0]+starts, starts+[values.count])
let result      = ranges.map{values[$0..<$1]}

//  result :  [ [0.0], [0.5, 0.51], [1.0], [1.5], [1.99, 2.0, 2.1], [2.5], [3.0]  ]             

//  how it works ...
//
//  intervals:   Contains pairs of consecutive values along with the index of second one
//               [value[i],(index,value[i+1])] 
//
//  starts:      Contains the index of second values that don't meet the grouping condition 
//               (i.e. start of next group) 
//               [index] 
//        
//  ranges:      Contains beginning and end indexes for group ranges formed using start..<end
//               [(Int,Int)]
//
//  result:      Groups of consecutive values meeting the inGroup condition
//                
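
If the grouping condition should be a parameter, as in some of the other answers, the same zip-based steps can be wrapped in a generic function. This is a sketch only: the function name `consecutiveGroups(of:where:)` is hypothetical, and the `guard` for empty input is an added assumption so that an empty array yields no groups.

```swift
// Hypothetical wrapper around the zip-based approach; `inGroup` is supplied by the caller.
func consecutiveGroups<T>(of values: [T], where inGroup: (T, T) -> Bool) -> [[T]] {
    guard !values.isEmpty else { return [] }

    // Indices at which a new group starts: the second element of each pair
    // of consecutive values that fails the grouping condition.
    let starts = zip(values, values.enumerated().dropFirst())
        .filter { pair in !inGroup(pair.0, pair.1.element) }
        .map { pair in pair.1.offset }

    // Pair each start index with the next one to form half-open ranges.
    let ranges = zip([0] + starts, starts + [values.count])
    return ranges.map { range in Array(values[range.0 ..< range.1]) }
}

let timestamps: [Double] = [0, 0.5, 0.51, 1.0, 1.5, 1.99, 2.0, 2.1, 2.5, 3.0]
let grouped = consecutiveGroups(of: timestamps, where: { abs($0 - $1) < 0.2 })
print(grouped)
```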
Alain T.
0

I would absolutely do it the way Aaron Brager does it. That, IMO, is the best Swift approach. Swift does not actually play all that well with functional programming. But to explore how you might, here is one way I might attack it.

I totally expect the performance on this to be horrible. Probably O(n^2). Recursively building up an array, like I do in groupWhile, forces it to copy the whole array at every step.

// Really Swift? We have to say that a subsequence has the same elements as its sequence?
extension CollectionType where SubSequence.Generator.Element == Generator.Element {

    // Return the prefix for which pred is true, and the rest of the elements.
    func span(pred: Generator.Element -> Bool) -> ([Generator.Element], [Generator.Element]) {
        let split = indexOf{ !pred($0) } ?? endIndex
        return (Array(self[startIndex..<split]), Array(self[split..<endIndex]))
    }

    // Start a new group each time pred is false.
    func groupWhile(pred: Generator.Element -> Bool) -> [[Generator.Element]] {
        guard self.count > 0 else { return [] }
        let (this, that) = span(pred)
        let next = that.first.map{[$0]} ?? []
        let rest = Array(that.dropFirst())
        return [this + next] + rest.groupWhile(pred)
    }
}

extension CollectionType where
    Generator.Element : FloatingPointType,
    Generator.Element : AbsoluteValuable,
    SubSequence.Generator.Element == Generator.Element {

    //
    // Here's the meat of it
    //
    func groupBySeqDistanceGreaterThan(n: Generator.Element) -> [[Generator.Element]] {

        // I don't love this, but it is a simple way to deal with the last element.
        let xs = self + [Generator.Element.infinity]

        // Zip us with our own tail. This creates a bunch of pairs we can evaluate.
        let pairs = Array(zip(xs, xs.dropFirst()))

        // Insert breaks where the difference is too high in the pair
        let groups = pairs.groupWhile { abs($0.1 - $0.0) < n }

        // Collapse the pairs down to values
        return groups.map { $0.map { $0.0 } }
    }
}
Rob Napier
  • Ha! Interesting. Yes, that's pretty tricky. So far I think you're definitely right about Aaron's solution, though. It's both the cleanest and the fastest so far, and by a fair margin. – jbm Oct 11 '15 at 03:27
  • BTW, @Rob, do you know if there's a way to filter the repetitions from my version above? It seems like another filter might be possible, testing for equality between the subarrays. No? – jbm Oct 11 '15 at 03:29
  • I don't think yours is correct by your description. Consider `[1, 2, 1.1, 3, 0.9]`. By your description, it should be `[[1], [2], [1.1], [3], [0.9]]`, but your algorithm gives `[[1, 1.1, 0.9], [2], [1, 1.1], [3], [1, 0.9]]`. – Rob Napier Oct 11 '15 at 03:36
  • True, but I will always have a sorted array—in my "real life" application, they would be timestamps for notes, and I'd be grouping them together into chords; i.e., notes that occur at approximately the same time. Mind you, mine just isn't correct anyway! ;) ...But with the repetitions removed, it should work as required. – jbm Oct 11 '15 at 03:48
  • Oops. Actually, you're right. Even sorted it would be: `[[0.9, 1.0], [0.9, 1.0, 1.1], [1.0, 1.1], [2.0], [3.0]]`. So it's more than a little bit wrong! Mind you, I've pretty much decided that Aaron's approach is the way to go anyway. – jbm Oct 11 '15 at 03:52