clojure: partition a seq based on a seq of values

Question

I would like to partition a seq based on a seq of values:

(partition-by-seq [3 5] [1 2 3 4 5 6]) 
((1 2 3)(4 5)(6))

The first input is a seq of split points. The second input is the seq I would like to partition. Each partition ends at a split point: the first partition runs up to and including the value 3, giving (1 2 3), and the second is (4 5), where 5 is the next split point. Whatever remains after the last split point forms the final partition, (6).

More examples:

(partition-by-seq [3] [2 3 4 5])
result: ((2 3)(4 5))

(partition-by-seq [2 5] [2 3 5 6])
result: ((2)(3 5)(6))

given: the first seq (split points) is always a subset of the second input seq.

What do you mean by 'split-points'? Is the sequence supposed to represent a range? — Lee, Mar 17 '15 at 15:06
split points are values on which the input seq gets partitioned. (partition-by-seq [3] [1 2 3 4 5 6]) will result in ((1 2 3) (4 5 6)) — Thomas Deutsch, Mar 17 '15 at 15:15
And what have you tried? Or you expect us to do your homework for you? — m0skit0, Mar 17 '15 at 15:23
I tried this: (filter #(< 1 (count %)) (partition-by (partial contains? first-ticks) (sort free-ticks))) which comes close, but I am missing the split points in my result seqs. — Thomas Deutsch, Mar 17 '15 at 15:25
Thanks for the question. All seqs are sorted and the values are unique. — Thomas Deutsch, Mar 17 '15 at 15:27
As a Clojure beginner, I am trying to build a time-management app. The first seq holds the time points at which I have an appointment. The second seq is a list of free time plus the start times of the appointments. — Thomas Deutsch, Mar 17 '15 at 15:32

score 1 · Answer 1 · answered Mar 17 '15 at 17:17

The sequence to be partitioned is the splittee, and each element of the split points (a.k.a. the splitter) marks the last element of a partition.

from your example:

splittee: [1 2 3 4 5 6]

splitter: [3 5]

result: ((1 2 3)(4 5)(6))

Because the input is sorted and holds unique integers, each resulting partition is an increasing run of integers, and such a run can be described as all x with start <= x < end. A split point p is the last element of its partition, so the exclusive end of that partition is p + 1.

So, from [3 5], we want the partitions whose exclusive ends are 4 and 6.

Adding each partition's start then turns the splitter into a sequence of [start end] pairs. The first start is the first element of the splittee, and the last end is one past the last element of the splittee.

so, the splitter [3 5] then becomes:

[[1 4] [4 6] [6 7]]

The splitter transformation can be done like this:

(->> (concat [(first splittee)]
             (mapcat (juxt inc inc) splitter)
             [(inc (last splittee))])
     (partition 2))
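
To make the intermediate values concrete, the transformation can be evaluated step by step on the first example from the question. This is only a sketch: `boundaries` is a name introduced here for illustration, while `splitter`, `splittee` and `transformed-splitter` follow the naming used in this answer.

```clojure
;; Example inputs from the question.
(def splittee [1 2 3 4 5 6])
(def splitter [3 5])

;; Each split point p contributes (inc p) twice: once as the exclusive end
;; of the current range and once as the start of the next range.
;; `boundaries` is a hypothetical helper name, not part of the answer.
(def boundaries
  (concat [(first splittee)]
          (mapcat (juxt inc inc) splitter)
          [(inc (last splittee))]))
;; boundaries ;;=> (1 4 4 6 6 7)

;; Pairing the flat list up gives one [start end] pair per partition.
(def transformed-splitter (partition 2 boundaries))
;; transformed-splitter ;;=> ((1 4) (4 6) (6 7))
```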

There is a nice symmetry between the transformed splitter and the desired result:

[[1 4] [4 6] [6 7]]

((1 2 3) (4 5) (6))

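The problem then becomes how to extract, for each [start end] pair in the transformed splitter, the elements of the splittee that fall inside that range.

Clojure has a subseq function that returns the entries of a sorted collection (such as a sorted-set) that lie between a start and an end bound. Each test is applied as (test element key), so >= start combined with < end selects start <= element < end. After putting the splittee into a sorted set, I can just map subseq over the elements of transformed-splitter:

(map (fn [[x y]]
       ;; elements e of the sorted set with x <= e < y
       (subseq (apply sorted-set splittee) >= x < y))
     transformed-splitter)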
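
A short illustration of how subseq behaves may help here. It only accepts sorted collections, and each test compares the element against the key, as (test element key). The sets below are made up for this sketch, and `sorted` and `gappy` are hypothetical names. The last two forms show why the start test matters once the start key itself is missing from the set.

```clojure
;; `sorted` and `gappy` are illustrative names, not from the answer.
(def sorted (apply sorted-set [1 2 3 4 5 6]))

(subseq sorted >= 1 < 4) ;;=> (1 2 3)
(subseq sorted >= 4 < 6) ;;=> (4 5)
(subseq sorted >= 4)     ;;=> (4 5 6)

;; A set with a gap: the start key 4 is not an element.
(def gappy (sorted-set 2 3 5 6))

;; >= keeps the first element found at or above the start key.
(subseq gappy >= 4 < 7)  ;;=> (5 6)

;; <= as the start test only accepts an element equal to the key,
;; so the first element above a missing key is skipped.
(subseq gappy <= 4 < 7)  ;;=> (6)
```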

by combining the steps above, my answer is:

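(defn partition-by-seq
  [splitter splittee]
  ;; build the sorted set once instead of once per range
  (let [sorted (apply sorted-set splittee)]
    (->> (concat [(first splittee)]
                 (mapcat (juxt inc inc) splitter)
                 [(inc (last splittee))])
         (partition 2)
         (map (fn [[x y]]
                ;; subseq applies each test as (test element key),
                ;; so this selects the elements e with x <= e < y
                (subseq sorted >= x < y))))))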

Wow, thank you very much. I can learn a lot from your example. — Thomas Deutsch, Mar 17 '15 at 17:20

score 1 · Accepted Answer · answered Mar 18 '15 at 16:40

I came up with this solution which is lazy and quite (IMO) straightforward.

(defn part-seq [splitters coll]
  (lazy-seq
   (when-let [s (seq coll)]
     (if-let [split-point (first splitters)]
       ; build seq until first splitter
       (let [run (cons (first s) (take-while #(<= % split-point) (next s)))]
         ; build the lazy seq of partitions recursively
         (cons run
               (part-seq (rest splitters) (drop (count run) s))))
       ; just return one partition if there is no splitter 
       (list coll)))))
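
A single step of the recursion can be traced by hand, which may make the run construction easier to follow. The names `s`, `split-point` and `run` mirror the locals of part-seq, and `late-run` is a hypothetical name; the input values are chosen here purely for illustration.

```clojure
;; One step of part-seq, written out with top-level defs.
(def s [0 1 2 3 4 5 6 7 8 9])
(def split-point 3)

;; The first element is always taken; the rest of the run is everything
;; that follows it up to and including the split point.
(def run (cons (first s) (take-while #(<= % split-point) (next s))))
;; run ;;=> (0 1 2 3)

;; The recursive call continues with what is left after the run.
(drop (count run) s) ;;=> (4 5 6 7 8 9)

;; Because the first element is consed on unconditionally, a run is never
;; empty, even when every element lies beyond the split point.
(def late-run (cons (first [7 8 9]) (take-while #(<= % split-point) (next [7 8 9]))))
;; late-run ;;=> (7)
```

Since every run holds at least one element, each recursive call consumes input, so the recursion always makes progress on a finite sequence.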

If the split points are all in the sequence:

(part-seq [3 5 8] [0 1 2 3 4 5 6 7 8 9])
;;=> ((0 1 2 3) (4 5) (6 7 8) (9))

If some split points are not in the sequence:

(part-seq [3 5 8] [0 1 2 4 5 6 8 9])
;;=> ((0 1 2) (4 5) (6 8) (9))

An example with infinite sequences for both the splitters and the sequence to split:

(take 5 (part-seq (iterate (partial + 3) 5) (range)))
;;=> ((0 1 2 3 4 5) (6 7 8) (9 10 11) (12 13 14) (15 16 17))

Thomas Deutsch · Answer 3 · 2015-03-17T17:27:21.040

This is the solution I came up with.

(def a [1 2 3 4 5 6])
(def p [2 4 5])

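(defn partition-by-seq [s input]
  (loop [i 0
         t input
         v (transient [])]
    (if (< i (count s))
      ;; split off everything up to and including the i-th split point
      (let [[head tail] (split-with #(<= % (nth s i)) t)]
        (recur (inc i) tail (conj! v head)))
      ;; use the value returned by conj!; a transient must not be
      ;; mutated in place with the result thrown away
      (remove empty? (persistent! (conj! v t))))))

(partition-by-seq p a)
;;=> ((1 2) (3 4) (5) (6))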
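
The core of each loop iteration is split-with, so it may help to look at what one call returns. This is a small sketch with made-up inputs; `step` is a name introduced here for illustration.

```clojure
;; split-with returns a vector of
;; [(take-while pred coll) (drop-while pred coll)].
;; `step` is a hypothetical name for one iteration's result.
(def step (split-with #(<= % 2) [1 2 3 4 5 6]))

(first step)  ;;=> (1 2)
(second step) ;;=> (3 4 5 6)

;; When the split point is also the last element, the remaining tail is
;; empty, which is why empty partitions are filtered out at the end.
(second (split-with #(<= % 3) [1 2 3])) ;;=> ()
```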
