How to fold on x elements of list per fold

Question

Say we have a list such as [1; 2; 3; 4; 5; 6], and I want to fold over it 2 elements at a time, so that each call of the folding function receives 2 elements.

So, I would apply the function on (1, 2), (3, 4), and (5, 6) in order.

Here was my attempt at a function to do so:

let fold_left_multiple (func: 'a -> 'b list -> 'a) (base: 'a) (lst: 'b list) (items_per_fold: int): 'a * 'b list =
    let (acc, remainder, _) = List.fold_left (fun (acc, cur_fold_acc, cur_num) el ->
        if cur_num mod items_per_fold = 0 then (func acc (List.rev (el::cur_fold_acc)), [], 1)
        else (acc, el::cur_fold_acc, cur_num + 1)
    ) (base, [], 1) lst in (acc, remainder)

This somewhat works; however, the folding function receives each group as a list, which makes the individual elements awkward to access.

My preferred implementation would somehow use a tuple or array to make element access easier.

Here is an example of the input/output I would like (using utop syntax). In this case, I am summing up each group of three elements.

# fold_left_multiple (fun lst (e1, e2, e3) -> (e1 + e2 + e3)::lst) [] [1; 2; 3; 4; 5; 6; 7; 8] 3;;
- : int list * int list = ([15; 6], [7; 8])

Here, if the list's length is not evenly divisible by n, the remaining elements are put into the second element of the tuple.

(I would not mind if this remainder is reversed in a solution.)

You can use first `List.map` to transform into a list of n-uples, and then use `List.fold` on the list of n-uples. — Anthony Scemama, Oct 05 '20 at 20:57
@AnthonyScemama I cannot visualize how I would do that mapping since n depends on what is inputted. — Gigi Bayte 2, Oct 05 '20 at 21:10
@SabdulUlahi you should try making a function that takes the first n elements of a list and returns these elements and the rest of the list, then plug it into a `fold_left_n` implementation similar to `fold_left_2` in @glennsl's answer. — Martin Jambon, Oct 05 '20 at 22:01
@MartinJambon The first part of your comment makes sense and is simple enough to do, but how could you make a `fold_left_n` implementation that works like glennsl's when the match patterns are length dependent? Also, the function would need to take in `n` parameters instead of a data structure like a tuple that could contain them, which would be a bit inconvenient for larger sizes of `n`. — Gigi Bayte 2, Oct 05 '20 at 22:08
@SabdulUlahi you can't do it with tuples, for sure. Instead you'd use lists or arrays. — Martin Jambon, Oct 06 '20 at 02:08

score 1 · Answer 1 · answered Oct 05 '20 at 23:20

Tuples are handy if there is a known and limited number of slots. But they do become quite unwieldy once this is not the case. Thus, I think there is nothing wrong with having the folder function receive a sub-list of the input list.

The usual way to get the first n elements (or fewer) in functional languages is by means of a function called take. Likewise, the usual way to get rid of the first n elements (or fewer) is by means of a function named drop.

With the help of those 2 functions, the function you want could be implemented like this:

(* take and drop seem to be missing from OCaml's half-full batteries,
   maybe because they are not idiomatic, or not efficient, or both.
 *)
let take n lst =
  let rec loop acc n l =
    match n, l with
    | 0, _ | _, [] -> List.rev acc
    | _, x::xs -> loop (x::acc) (n-1) xs in
  loop [] n lst

let drop n lst =
  let rec loop n l =
    match n, l with
    | 0, _ | _, [] -> l
    | _, _::xs -> loop (n-1) xs in
  loop n lst

let fold_windowed folder wsize acc lst =
  let rec loop acc l =
    match l with
    | [] -> List.rev acc
    | _::_ ->
       loop (folder acc (take wsize l)) (List.tl l) in
  loop acc lst

With a little help from some additional functions I am used to in F# but could not find out of the box in OCaml, you can use fold_windowed as follows:

let id x = x (* OCaml 4.08 and later also provide this as Fun.id *)

(* Derived from F#'s List.init. The name `initializer` is a reserved keyword in
   OCaml, hence the somewhat silly name 'initor'. Unlike the standard library's
   List.init, which calls its function on 0 .. n-1, this calls initor on 1 .. n.
 *)
let list_init n initor =
  let rec loop acc i =
    match i with
    | 0 -> acc
    | _ -> loop ((initor i)::acc) (i-1) in
  loop [] n

# fold_windowed (fun acc l -> l::acc) 3 [] (list_init 10 id);;
- : int list list =
[[1; 2; 3]; [2; 3; 4]; [3; 4; 5]; [4; 5; 6]; [5; 6; 7]; [6; 7; 8]; [7; 8; 9];
 [8; 9; 10]; [9; 10]; [10]]

Good answer, but it doesn't quite answer the question. The sublists should not be overlapping. — glennsl, Oct 06 '20 at 19:57
Oh - well - it is a minimal change ... Just replace `(List.tl l)` with ` (drop wsize l) ` and you have it non-overlapping. — BitTickler, Oct 06 '20 at 20:34
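For reference, the change described in the comment above might look like the following sketch. The name fold_chunked is made up here to distinguish it from fold_windowed; take and drop are repeated so the snippet stands alone. It assumes wsize is positive (with wsize = 0 the loop would never advance), and, like fold_windowed, it assumes the accumulator is a list because the result is reversed at the end.

```ocaml
(* take n lst: the first n elements of lst (or fewer if lst is shorter). *)
let take n lst =
  let rec loop acc n l =
    match n, l with
    | 0, _ | _, [] -> List.rev acc
    | _, x :: xs -> loop (x :: acc) (n - 1) xs
  in
  loop [] n lst

(* drop n l: l without its first n elements (or [] if l is shorter). *)
let rec drop n l =
  match n, l with
  | 0, _ | _, [] -> l
  | _, _ :: xs -> drop (n - 1) xs

(* Hypothetical non-overlapping variant of fold_windowed: after each call of
   folder, skip the whole window instead of a single element.
   Assumes wsize > 0. *)
let fold_chunked folder wsize acc lst =
  let rec loop acc l =
    match l with
    | [] -> List.rev acc
    | _ :: _ -> loop (folder acc (take wsize l)) (drop wsize l)
  in
  loop acc lst

(* chunks = [[1; 2; 3]; [4; 5; 6]; [7; 8]] *)
let chunks = fold_chunked (fun acc l -> l :: acc) 3 [] [1; 2; 3; 4; 5; 6; 7; 8]
```

Note that the final, shorter chunk is still passed to folder rather than returned separately as a remainder.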

glennsl · Accepted Answer · 2020-10-05T23:33:32.270

You could just modify the standard fold_left function to operate on multiple elements. Here's one that operates on pairs:

let rec fold_left_2 f acc l =
  match l with
  | a::b::rest -> fold_left_2 f (f acc a b) rest
  | remainder -> (acc, remainder)

Edit: Modified to return the remainder, as asked, instead of ignoring it.
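As a usage sketch, here is fold_left_2 applied to a list of odd length, together with what a hand-written version for triples might look like. fold_left_3 and the summing functions are only illustrations and are not part of the original answer.

```ocaml
(* fold_left_2 as defined above. *)
let rec fold_left_2 f acc l =
  match l with
  | a :: b :: rest -> fold_left_2 f (f acc a b) rest
  | remainder -> (acc, remainder)

(* Sum each pair and cons the sum onto the accumulator.
   sums = [7; 3], remainder = [5] *)
let sums, remainder =
  fold_left_2 (fun acc a b -> (a + b) :: acc) [] [1; 2; 3; 4; 5]

(* Hypothetical specialization for triples, written the same way. *)
let rec fold_left_3 f acc l =
  match l with
  | a :: b :: c :: rest -> fold_left_3 f (f acc a b c) rest
  | remainder -> (acc, remainder)
```

Because the folding function takes the elements as separate arguments, there is nothing to pattern match on and no short-group case to handle at the call site.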

To illustrate my point in the comments that generalizing this to an arbitrary number of elements is possible, but not very beneficial, here's an implementation that allows the input list to be split arbitrarily using a split function:

let rec fold_left_n splitf f acc l =
  match splitf l with
  | None, remainder -> (acc, remainder)
  | Some x, rest -> fold_left_n splitf f (f acc x) rest

And its invocation using your example:

fold_left_n
  (function a::b::c::rest -> (Some (a, b, c), rest) | remainder -> (None, remainder))
  (fun lst (e1, e2, e3) -> (e1 + e2 + e3)::lst) [] [1; 2; 3; 4; 5; 6; 7; 8];;

Similarly, a function could be written that extracts sublists of arbitrary length which I haven't bothered to implement, but its invocation would look something like this:

fold_left_n 3
  (fun lst -> function
    | [e1; e2; e3] -> (e1 + e2 + e3)::lst
    | _ -> lst (* we assume we're getting a 3-element list, but the compiler doesn't know that so we need to ignore everything else *)
  ) [] [1; 2; 3; 4; 5; 6; 7; 8];;

They're both very complex and verbose in use and provide little benefit over just writing specialized implementations.
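The sublist-extracting variant is left unimplemented in the answer. One way it might be sketched is to keep the splitf-based fold_left_n as it is and pass it a splitter that peels off n elements at a time. The name split_n is invented here, and rejecting non-positive n is an assumption made to avoid an endless loop.

```ocaml
(* fold_left_n as defined above. *)
let rec fold_left_n splitf f acc l =
  match splitf l with
  | None, remainder -> (acc, remainder)
  | Some x, rest -> fold_left_n splitf f (f acc x) rest

(* Hypothetical splitter: returns (Some first_n, rest) when l has at least
   n elements, and (None, l) otherwise. *)
let split_n n l =
  if n <= 0 then invalid_arg "split_n: n must be positive";
  let rec loop k acc rest =
    if k = 0 then (Some (List.rev acc), rest)
    else
      match rest with
      | [] -> (None, l)
      | x :: xs -> loop (k - 1) (x :: acc) xs
  in
  loop n [] l

(* Sum each group of three; the folder receives an int list of length 3.
   sums = [15; 6], remainder = [7; 8] *)
let sums, remainder =
  fold_left_n (split_n 3)
    (fun lst chunk -> List.fold_left ( + ) 0 chunk :: lst)
    [] [1; 2; 3; 4; 5; 6; 7; 8]
```

Summing works without pattern matching because it treats the chunk as a whole; a folder that needs e1, e2 and e3 by name would still have to match on the list, which is the verbosity the answer points out.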

This works well for pairs! Though, I was hoping to further generalize it to every n elements whereas this only seems to work for n = 2. — Gigi Bayte 2, Oct 05 '20 at 21:19
I don't think there's much to generalize here. Not in a way that would reduce the amount of code consumers would have to write, at least. You could either write a fold function that accepts another function to split the list into tuples, or one that would extract sublists of arbitrary length and pass those to the fold function, but either way the consumer will have to pattern match on the list to extract the elements and account for the possibility of the list having fewer than the expected number of elements. Which ends up being about as much code as writing tailor-made fold functions. — glennsl, Oct 05 '20 at 22:06
I've updated the answer to illustrate what that might look like, and why I don't think it's beneficial. — glennsl, Oct 05 '20 at 22:37

score 0 · Answer 3 · answered Oct 05 '20 at 22:04

It might help to decide what type you want your function to have.

There is no type that represents tuples with different numbers of elements, even if all the elements are ints. Each number of elements is a different type: int * int, int * int * int, etc.

If you want to write a general function, then your folded function will need to get its input in some form other than a tuple, such as a list.
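To make the type concrete, here is a minimal sketch that follows the argument order from the question but hands each group to the folded function as an array, so elements can be accessed by index. The name fold_left_chunks and the decision to reject non-positive n are assumptions, not something from the question.

```ocaml
(* Hypothetical: fold over lst in groups of n, passing each full group to f
   as an array. Returns the final accumulator and the leftover elements
   (in their original order) that did not fill a whole group. *)
let fold_left_chunks (f : 'a -> 'b array -> 'a) (base : 'a) (lst : 'b list)
    (n : int) : 'a * 'b list =
  if n <= 0 then invalid_arg "fold_left_chunks: n must be positive";
  (* take k [] l: Some (first k elements, rest), or None if l is too short. *)
  let rec take k acc rest =
    if k = 0 then Some (List.rev acc, rest)
    else
      match rest with
      | [] -> None
      | x :: xs -> take (k - 1) (x :: acc) xs
  in
  let rec loop acc l =
    match take n [] l with
    | None -> (acc, l)
    | Some (chunk, rest) -> loop (f acc (Array.of_list chunk)) rest
  in
  loop base lst

(* result = ([15; 6], [7; 8]) *)
let result =
  fold_left_chunks
    (fun lst a -> (a.(0) + a.(1) + a.(2)) :: lst)
    [] [1; 2; 3; 4; 5; 6; 7; 8] 3
```

The array always has exactly n elements when f is called, but the compiler cannot check that, so an out-of-range index such as a.(3) would only fail at run time.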
