30

In a programming language that is purely functional (like Haskell), or one that you are only using in a functional way (e.g. Clojure), suppose you have a list/seq/enumerable of integers (of unknown size) and you want to produce a new list/seq/enumerable that contains the differences between successive items. How would you do it?

What I did previously in C# was to fold over the list and keep a state object as the aggregating value, which recorded the 'previous' item so that you could compute the difference from the current item. The result list also had to go into the state object (which is a problem for a list of unknown size).

What is the general approach for doing this kind of thing functionally?

Thomas
Pieter Breed
  • What kind of a list is it? Is it a linked list where you can iterate through the list until you reach NIL? If so, why not iterate from the second item and in each iteration save the value of the previous element and calculate the difference and push it to the new list? – Odinn Mar 01 '12 at 08:02
  • @Odinn Care to put that in code? – Pieter Breed Mar 01 '12 at 09:57
  • 1
    I'd be interested in seeing the C# approach for comparison, especially if you're using LINQ to do this. – eternalmatt Mar 01 '12 at 20:32
  • 1
    Note that Data.List in Haskell contains a function for this specific case (generating a new list while keeping track of a state): mapAccum[LR]. With it, your function would be written `diffs = drop 1 . snd . mapAccumL (\x y -> (y, y-x)) 0`. Though the idiomatic code with zipWith is more readable in this particular case, in my humble opinion. – Jedai Mar 02 '12 at 11:28
  • @eternalmatt added it in a separate answer for you – Pieter Breed Mar 05 '12 at 07:54

7 Answers

33

In Haskell you would probably just use a higher-order function like zipWith. So you could do something like this:

diff [] = []
diff ls = zipWith (-) (tail ls) ls

Note how I handled the [] case separately--if you pass an empty list to tail you get a runtime error, and Haskellers really, really hate runtime errors. However, in my function, I'm guaranteed that ls is not empty by the time tail is called, so using tail is safe. (For reference, tail just returns everything except the first item of the list. It's the same as cdr in Scheme.)

This just takes the list and its tail and combines the corresponding items using the (-) function.

Given a list [1,2,3,4], this would go something like this:

zipWith (-) [2,3,4] [1,2,3,4]
[2-1, 3-2, 4-3]
[1,1,1]

This is a common pattern: you can compute surprisingly many things by cleverly using standard higher-order functions. You are also not afraid of passing in a list and its own tail to a function--there is no mutation to mess you up and the compiler is often very clever about optimizing code like this.

Incidentally, if you like list comprehensions and don't mind enabling the ParallelListComp extension, you could write zipWith (-) (tail ls) ls like this:

[b - a | a <- ls | b <- tail ls]
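
As a minimal sketch of the same idea, the version below swaps tail for drop 1, which returns [] for an empty list, so no separate [] case is needed. The name diffs and the Num a signature are illustrative choices, not anything from a library.

```haskell
-- Sketch: successive differences via zipWith.
-- drop 1 is total (drop 1 [] == []), so the empty list needs no special case.
diffs :: Num a => [a] -> [a]
diffs xs = zipWith (-) (drop 1 xs) xs
```

Because zipWith is lazy, this should also work on an infinite list: take 3 (diffs [1, 4 ..]) evaluates to [3,3,3].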
Tikhon Jelvis
  • 11
    Or `diff xs = zipWith (-) (drop 1 xs) xs`. – augustss Mar 01 '12 at 08:31
  • 2
    Or `diff ls = zipWith (flip (-)) ls (tail ls)` – Landei Mar 01 '12 at 09:32
  • 4
    is7s's version crashes when given an empty list, and since it's pointfree you can't handle the `[]` case separately, as Tikhon does. augustss and Landei's versions both correctly handle the `[]` case without needing to handle it separately. I'd argue that augustss's version is the clearest of the five presented above. – dave4420 Mar 01 '12 at 09:52
  • @dave4420: Yeah, I like his version better than mine too. – Tikhon Jelvis Mar 01 '12 at 10:04
  • 1
    @dave4420 Yes, `drop 1` should be used for safety. – is7s Mar 01 '12 at 10:14
  • Or by pattern matching, no drop/tail required: `diff ls@(_:tls) = zipWith (-) tls ls` – Nathan Howell Mar 01 '12 at 21:00
  • @NathanHowell: That has the same problem as using `tail` because it doesn't handle `[]` properly. Also, I thought using `tail` or `drop 1` looks better, but that's obviously a matter of opinion. – Tikhon Jelvis Mar 01 '12 at 21:03
  • @TikhonJelvis yes, the empty list case would have to be handled per the original response. – Nathan Howell Mar 01 '12 at 21:05
  • I don't think this is better than augustss', but for completeness: `diff ls@(~(_:tls)) = zipWith subtract ls tls` – this is basically as evil as using tail, except that you'll get *slightly* better error messages if you screw up. – Ben Millwood Mar 16 '12 at 13:29
  • `diff == join (zipWith(-).drop 1) == drop 1 >>= zipWith (-) == ap (flip $ zipWith (-)) (drop 1) == ap (zipWith $ flip (-)) (drop 1)`. – Will Ness Jul 14 '12 at 10:23
26

In Clojure, you can use the map function:

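; Subtracts each item's successor from the item itself (item - next).
; For next - item, swap the arguments: (map - (rest coll) coll)
(defn diff [coll]
  (map - coll (rest coll)))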
Jonas
  • 4
    +1 for a great example, also worth noting that this is lazy so the differences will only be calculated on demand. coll can even be an infinite sequence. – mikera Mar 01 '12 at 09:25
  • I had to go and paste this into a clojure REPL to convince myself that this is in fact valid clojure code. After squinting at it for a while I'm starting to make sense out of it. It's awesome, thank you! – Pieter Breed Mar 01 '12 at 09:53
  • It works since `map` truncates all sequence args to the length of the shortest one. Thus `coll` is truncated to contain elements `[0 .. N-2]` while `(rest coll)` contains elements `[1 .. N-1]`. `map` then applies the `-` function to corresponding elements from each sequence like `(- xi yi)` – Alan Thompson Dec 13 '19 at 22:40
  • Note that you can use `mapv` if you don't want a lazy result (returns a vector). – Alan Thompson Dec 13 '19 at 22:42
13

You can also pattern-match consecutive elements. In OCaml:

let rec diff = function
  | [] | [_]       -> []
  | x::(y::_ as t) -> (y-x) :: diff t

And the usual tail-recursive version:

let diff =
  let rec aux accu = function
  | [] | [_]       -> List.rev accu
  | x::(y::_ as t) -> aux ((y-x)::accu) t in
  aux []
Thomas
  • @Fabrice Your edit is not in the spirit of edits on this site. I think your suggestion is an improvement, but what if the original answerer does not agree? What if he thinks it makes his answer worse? – Pascal Cuoq Mar 01 '12 at 21:54
7

For another Clojure solution, try

(map (fn [[a b]] (- b a))
     (partition 2 1 coll))
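
For comparison, here is a rough Haskell sketch of the same pairing idea: (partition 2 1 coll) yields overlapping pairs of consecutive items, and zipping a list with itself minus its first element does much the same. The names pairs and diffsViaPairs are hypothetical helpers made up for this illustration.

```haskell
-- Sketch: pair each element with its successor, then subtract.
-- pairs is a hypothetical helper, not a library function.
pairs :: [a] -> [(a, a)]
pairs xs = zip xs (drop 1 xs)

-- Successor minus item, matching (- b a) in the Clojure snippet.
diffsViaPairs :: Num a => [a] -> [a]
diffsViaPairs xs = [b - a | (a, b) <- pairs xs]
```

Keeping the pairing step separate can be handy when the pairs themselves are needed for something other than subtraction.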
Retief
5

Just to complement the idiomatic answers: it is possible in functional languages to process a list using a state object, just like you described. It is definitely discouraged in cases when simpler solutions exist, but possible.

The following example implements iteration by computing the new 'state' and passing it recursively to itself.

(defn diffs
  ([coll] (diffs (rest coll) (first coll) []))
  ([coll prev acc]
     (if-let [s (seq coll)]
       ; new 'state': rest of the list, head as the next 'prev' and
       ; diffs with the next difference appended at the end:
       (recur (rest s) (first s) (conj acc (- (first s) prev)))
       acc)))

The state consists of the previous value from the list (prev), the diffs computed so far (acc), and the rest of the list left to process (coll).
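
The same explicit-state approach can be sketched in Haskell with a strict left fold. This is only an illustration under my own naming (diffsFold and step are made up here); the state is a pair of the previous element, if any, and the differences collected so far in reverse order.

```haskell
import Data.List (foldl')

-- Sketch: thread explicit state through a strict left fold.
-- State = (previous element if any, differences so far, newest first).
diffsFold :: Num a => [a] -> [a]
diffsFold = reverse . snd . foldl' step (Nothing, [])
  where
    -- First element: nothing to subtract from yet, just remember it.
    step (Nothing, acc) x = (Just x, acc)
    -- Later elements: record the difference and remember the new element.
    step (Just p, acc) x = (Just x, (x - p) : acc)
```

Unlike the lazy zipWith or map versions, a left fold has to consume the whole input before it returns anything, so this sketch is not suitable for infinite lists.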

Rafał Dowgird
4

This is how it can be done in Haskell without any standard functions, just recursion and pattern matching:

diff :: [Int] -> [Int]
diff []     = []
diff (x:xs) = hdiff xs x


hdiff :: [Int] -> Int -> [Int]
hdiff []     p = []
hdiff (x:xs) p = (x-p):hdiff xs x
dave4420
nist
1

OK, here are two C# versions for those who are interested:

First, the bad version, or the one that a previously imperative programmer (in other words, I) might try to write while learning functional programming:

  private static IEnumerable<int> ComputeUsingFold(IEnumerable<int> source)
  {
     // Previous stays null until the first item has been seen.
     var seed = new {Result = new List<int>(), Previous = (int?)null};
     return source.Aggregate(
        seed,
        (aggr, item) =>
           {
              // Only record a difference once there is a previous item.
              if (aggr.Previous.HasValue)
              {
                 aggr.Result.Add(item - aggr.Previous.Value);
              }
              return new { Result = aggr.Result, Previous = (int?)item };
           }).Result;
  }

Then a better version, using the idioms expressed in the other answers to this question:

  private static IEnumerable<int> ComputeUsingMap(IEnumerable<int> source)
  {
     return source.Zip(source.Skip(1), (f, s) => s - f);
  }

Note that in this version the source enumerable is enumerated twice: once directly by Zip and once through Skip(1). That can matter if the source is expensive to produce or can only be enumerated once.

Pieter Breed