Highest product of 3 implementation in Haskell

Question

I'd like the algorithm for the highest-product-of-3 problem implemented in Haskell. Here's the problem statement:

Given an array of integers, find the highest product you can get from three of the integers.

For example given [1, 2, 3, 4], the algorithm should return 24. And given [-10, -10, 5, 1, 6], the highest product of 3 would be 600 = -10*-10*6.

My attempt (assumed no negatives for the first try):

sol2' a b c []     = a*b*c
sol2' a b c (x:xs) = sol2' a' b' c' xs
  where 
    a' = if (x > a) then x else a
    b' = if (x > a && a > b) then a else b
    c' = if (x > a && a > b && b > c) then b else c

sol2 li = sol2' a b c li
  where a = 0
        b = 0 
        c = 0

I tested the implementation with [3, 5, 1, 2, 4, 10, 0, 4, 8, 11] but the return value is 550, which is supposed to be 880.

Hint: what do we know about the element that produce the highest product? — Willem Van Onsem, Nov 08 '17 at 18:43
Your logic for `a',b',c'` looks wrong. Suppose `a=8,b=5,c=2` and `x=7` comes. Do you get `a'=8,b'=7,c'=5` for the next iteration ? — chi, Nov 08 '17 at 18:46
Hint: try to define all the variables together as in `where (a',b',c') = if (...) then (x,a,b) else if (...) then (a,x,b) else ...` (you could also use guards for that) — chi, Nov 08 '17 at 18:48
@WillemVanOnsem that they are the three maximum numbers in the list or the maximum and two minimum negative numbers? — Aria Pahlavan, Nov 08 '17 at 18:57
@chi I was trying to define all of those variables but I was getting compilation errors which were very confusing. but the idea is to use pattern-matching for returning the result in the if-then-else expressions, correct? — Aria Pahlavan, Nov 08 '17 at 18:59
There's no pattern matching in my comment above, except for `(a',b',c') =`. The `if` part simply evaluates to a 3-tuple, it does not pattern match anything. — chi, Nov 08 '17 at 19:02

Willem Van Onsem · Accepted Answer · 2017-11-08T19:47:14.883

Positive numbers

You are on the right track in the sense that you look for the highest numbers. The problem, however, is that a new element x is only stored when it exceeds a, the largest value seen so far. An element that is smaller than a but larger than b or c is silently dropped.

Take for instance the numbers [6,2,4]. The way (a,b,c) evolves through the recursion is:

(0,0,0) -> (6,0,0) -> (6,0,0) -> (6,0,0)

Once the 6 has been stored, neither 2 nor 4 exceeds a, so both are discarded and the result is 6*0*0 = 0 instead of 48. The same thing happens in your test: the final triple is (11,10,5), because the 8 arrived after the 10 and was never stored, hence 550 instead of 880.

Although there are many ways to solve this, probably the best way is to keep the tuple ordered, a <= b <= c, and to compare each new element against all three values.

So we can use:

sol1 = sol2' (0,0,0)

sol2' (a,b,c) []     = a*b*c
sol2' t@(a,b,c) (x:xs) = sol2' f xs
  where f | x >= c = (b,c,x)
          | x >= b = (b,x,c)
          | x > a = (x,b,c)
          | otherwise = t

This produces the expected results:

Prelude> sol1 [1,2,3,4]
24
Prelude> sol1 [3, 5, 1, 2, 4, 10, 0, 4, 8, 11]
880
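To see how the ordered tuple evolves, the update step can be pulled out into its own function and run through scanl. This is only a sketch: the names step and trace3 are hypothetical and not part of the code above, and the element type is fixed to Integer for simplicity.

```haskell
-- Hypothetical helper: one update of the ordered triple (a <= b <= c),
-- using the same guards as sol2' above.
step :: (Integer, Integer, Integer) -> Integer -> (Integer, Integer, Integer)
step t@(a, b, c) x
  | x >= c    = (b, c, x)
  | x >= b    = (b, x, c)
  | x > a     = (x, b, c)
  | otherwise = t

-- Hypothetical helper: every intermediate triple, starting from (0,0,0).
trace3 :: [Integer] -> [(Integer, Integer, Integer)]
trace3 = scanl step (0, 0, 0)

-- trace3 [6,2,4,3] == [(0,0,0),(0,0,6),(0,2,6),(2,4,6),(3,4,6)]
```

Unlike in the original attempt, the 3 that arrives last still replaces the 2, because it is compared against all three stored values.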

Intermezzo: keeping track of numbers when negatives are present

Your program takes (0,0,0) as the initial three values. But if the list contains only negative numbers (e.g. [-1,-2,-3]), we of course want to keep track of those instead. We can do this, for instance, by initializing our tuple with elements from the list:

import Data.List(sort)

sol1 (xa:xb:xc:xs) = sol2' (a,b,c) xs
    where [a,b,c] = sort [xa,xb,xc]

So now we take the first three elements, sort them, and use them as the initial tuple. The rest of the list is then processed as before. This function will raise an error if sol1 is given a list with fewer than three elements, but in that case there is no answer anyway. We can use a Maybe to handle the fact that the function is non-total.
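A minimal sketch of that Maybe variant could look as follows. The name safeSol1 and the local helper go are hypothetical, and the type is fixed to Integer for simplicity. Note that at this stage it still only multiplies the three largest numbers; pairs of negative numbers are dealt with in the next section.

```haskell
import Data.List (sort)

-- Hypothetical total version of sol1: Nothing for fewer than three elements.
safeSol1 :: [Integer] -> Maybe Integer
safeSol1 (xa:xb:xc:rest) = Just (go (a, b, c) rest)
  where
    -- Seed the ordered triple with the first three elements of the list.
    [a, b, c] = sort [xa, xb, xc]
    go (p, q, r) [] = p * q * r
    go t@(p, q, r) (x:xs) = go f xs
      where
        f | x >= r    = (q, r, x)
          | x >= q    = (q, x, r)
          | x > p     = (x, q, r)
          | otherwise = t
safeSol1 _ = Nothing
```

With this definition, safeSol1 [1,2] is Nothing, safeSol1 [1,2,3,4] is Just 24, and safeSol1 [-1,-2,-3] is Just (-6).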

All numbers

Of course we also want to deal with negative numbers. Multiplying two negative numbers results in a positive number. So by keeping track of the two smallest numbers as well, we can then do the math properly. So first we will use another argument (d,e) to keep track of the smallest numbers with d <= e:

sol1_all = sol2_all' (0,0,0) (0,0)

sol2_all' (a,b,c) (d,e) []     = -- ...
sol2_all' t@(a,b,c) u@(d,e) (x:xs) = sol2_all' f g xs
  where f | x >= c = (b,c,x)
          | x >= b = (b,x,c)
          | x > a = (x,b,c)
          | otherwise = t
        g | x <= d = (x,d)
          | x <= e = (d,x)
          | otherwise = u

So now we have obtained the greatest numbers (a,b,c) and the smallest numbers (d,e). If d and e are indeed negative, then the only way to produce a large product with them is to multiply both by the largest number c. So we have two candidates to consider: a*b*c and c*d*e. We can write this as:

sol2_all' (a,b,c) (d,e) [] = max (a*b*c) (c*d*e)
sol2_all' t@(a,b,c) u@(d,e) (x:xs) = sol2_all' f g xs
  where f | x >= c = (b,c,x)
          | x >= b = (b,x,c)
          | x > a = (x,b,c)
          | otherwise = t
        g | x <= d = (x,d)
          | x <= e = (d,x)
          | otherwise = u

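Note, however, that this will not always produce the correct result: the tuples start from zeros that are not elements of the list (for [-1,-2,-3] the result would be 0 instead of -6), and the same number could be counted in both tuples. We can solve this by properly initializing the tuples: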

import Data.List(sort)

sol1_all (xa:xb:xc:xs) = sol2_all' (a,b,c) (a,b) xs
    where [a,b,c] = sort [xa,xb,xc]

sol2_all' (a,b,c) (d,e) [] = max (a*b*c) (c*d*e)
sol2_all' t@(a,b,c) u@(d,e) (x:xs) = sol2_all' f g xs
  where f | x >= c = (b,c,x)
          | x >= b = (b,x,c)
          | x > a = (x,b,c)
          | otherwise = t
        g | x <= d = (x,d)
          | x <= e = (d,x)
          | otherwise = u

Rationale behind picking different (possibly equivalent) elements

How do we know that we will not use an element twice? We only ever evaluate a*b*c or c*d*e. For a list with exactly three elements, this boils down to max (a*b*c) (a*b*c), where a, b and c are the result of sort, so uniqueness is guaranteed. For longer lists, an element x can only end up in both tuples if it is added to the first tuple below b and to the second tuple above d, that is, if a <= x <= b. In that case we obtain the tuples (x,b,c) and (a,x), and we evaluate x*b*c and a*x*c, so x does not occur twice in either product.
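One way to gain some confidence in this argument is to compare the linear algorithm against the brute-force definition maxProd3 that Daniel Wagner gives in the comments. The sketch below does this exhaustively on small lists using only base. The names linearMaxProd3 and agreesOnSmallLists are hypothetical, and the linear algorithm is restated with Integer elements so that the block is self-contained.

```haskell
import Data.List (sort, tails)

-- Brute-force specification from Daniel Wagner's comment: try every
-- triple of distinct positions. O(n^3), only meant for testing.
maxProd3 :: [Integer] -> Integer
maxProd3 xs = maximum [x * y * z | x:ys <- tails xs, y:zs <- tails ys, z <- zs]

-- The linear algorithm from above, restated under a hypothetical name.
-- Like sol1_all, it is only defined for lists with at least three elements.
linearMaxProd3 :: [Integer] -> Integer
linearMaxProd3 (xa:xb:xc:rest) = go (a, b, c) (a, b) rest
  where
    [a, b, c] = sort [xa, xb, xc]
    go (p, q, r) (d, e) [] = max (p * q * r) (r * d * e)
    go t@(p, q, r) u@(d, e) (x:xs) = go f g xs
      where
        f | x >= r    = (q, r, x)
          | x >= q    = (q, x, r)
          | x > p     = (x, q, r)
          | otherwise = t
        g | x <= d    = (x, d)
          | x <= e    = (d, x)
          | otherwise = u
linearMaxProd3 _ = error "linearMaxProd3: need at least three elements"

-- Exhaustive check over every list of length 3 to 5 with entries in [-3..3].
agreesOnSmallLists :: Bool
agreesOnSmallLists =
  and [ linearMaxProd3 xs == maxProd3 xs
      | n <- [3 .. 5]
      , xs <- sequence (replicate n [-3 .. 3]) ]
```

Evaluating agreesOnSmallLists in GHCi compares both functions on all 19551 such lists. A property-based test with QuickCheck would cover larger inputs, but it needs an extra package.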

Leetcode challenge

I submitted a Python version of this code to the Leetcode Challenge and it was accepted:

class Solution:
    def maximumProduct(self, nums):
        a,b,c = d,e,_ = sorted(nums[:3])
        for x in nums[3:]:
            if x >= c:
                a,b,c = b,c,x
            elif x >= b:
                a,b = b,x
            elif x >= a:
                a = x
            if x <= d:
                d,e = x,d
            elif x < e:
                e = x
        return max(a*b*c,c*d*e)

Also if the list could include negative numbers, then would `sol2` need to have two more parameters to keep track the min numbers? — Aria Pahlavan, Nov 08 '17 at 19:08
@AriaPahlavan I think things get quite a bit more complicated if negative numbers are allowed. There's all kinds of fun edge cases -- e.g. perhaps there are *only* negative numbers, in which case you want the biggest ones again instead of the smallest ones. You should also think carefully about when including 0 might be the right answer. The simplest (though inefficient) is to compute all the products: `maxProd3 xs = maximum [x*y*z | x:ys <- tails xs, y:zs <- tails ys, z <- zs]`. (Even if you don't use that implementation in the end, it's a good one to test against with QuickCheck or similar.) — Daniel Wagner, Nov 08 '17 at 19:15
@DanielWagner would become that complicated? I can't think of that many corner cases though. Anyhow, `maxProd3` is O(N^3) for time and space complexity, isn't it? — Aria Pahlavan, Nov 08 '17 at 19:26
@DanielWagner: it is possible by keeping track of the two lowest numbers. — Willem Van Onsem, Nov 08 '17 at 19:36
@AriaPahlavan "Would become that complicated?" The simplest obviously correct formulation of a solution that I can think of has four cases. "`maxProd3` is O(N^3), isn't it?" Yes, as I mentioned, it is inefficient; but useful as a specification to test against. — Daniel Wagner, Nov 08 '17 at 19:36
@DanielWagner: or we can hope that *Leetcode* has a good judging system https://leetcode.com/problems/maximum-product-of-three-numbers/description/ :) Leetcode also has its own solution (I obtained it after a successful submission), but I don't know if I'm allowed to share this. — Willem Van Onsem, Nov 08 '17 at 19:39
Interesting. The argument that this is correct is pretty subtle. I like it a lot! You might like `where a:b:c:_ = coerce (sort :: [Down a] -> [Down a]) xs; d:e:_ = sort xs` (with `ScopedTypeVariables`) or similar as a prettier way of getting the three largest and two smallest elements (still in linear time). Full details [here](http://lpaste.net/359898). — Daniel Wagner, Nov 08 '17 at 19:51
I agree it's a very subtle argument and explanation. Thanks! — Aria Pahlavan, Nov 08 '17 at 20:40
And I like the Python solution too. My Java implementation was a bit more verbose/noisy, which kinda bothers me — Aria Pahlavan, Nov 08 '17 at 20:42

Karl Bielefeldt · Answer 2 · 2017-11-08T22:46:41.740

There are somewhat more efficient solutions, but I would lean toward something more straightforward like:

import Data.List (subsequences)
f :: (Num a, Ord a) => [a] -> a
f = maximum . map product . filter ((==3) . length) . subsequences

Thinking about functional algorithms as sequences of transformations on collections makes them much more idiomatic than transforming imperative loops into recursive functions.

Note if you are doing this with really long lists where efficiency is a concern, you can sort the list first, then take the lowest two and the highest three, and the algorithm will still work:

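import Data.List (sort)
takeFirstLast xs | length xs <= 5 = xs  -- too short to trim without duplicating elements
                 | otherwise = take 2 sorted ++ drop (length sorted - 3) sorted
  where sorted = sort xs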

However, my original way is plenty fast up to lists of size 100 or so, and is a lot easier to understand. I don't believe in sacrificing readability for speed until I'm told it's an actual requirement.
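As a variation on the sorting idea, once the list is sorted the two candidate products can be read off directly, without going through subsequences at all. This is a sketch under assumptions rather than part of the answer above: the name maxProd3Sorted is hypothetical, the element type is fixed to Integer, and a Maybe result is used for lists with fewer than three elements.

```haskell
import Data.List (sort)

-- Hypothetical O(n log n) variant: sort once, then compare the product of
-- the three largest elements with the product of the two smallest elements
-- and the largest one.
maxProd3Sorted :: [Integer] -> Maybe Integer
maxProd3Sorted xs
  | n < 3     = Nothing
  | otherwise = Just (max (a * b * c) (d * e * c))
  where
    sorted    = sort xs
    n         = length sorted
    -- These pattern bindings are lazy, so they are only forced when n >= 3.
    d : e : _ = sorted
    [a, b, c] = drop (n - 3) sorted
```

For example, maxProd3Sorted [-10,-10,5,1,6] is Just 600 and maxProd3Sorted [1,2] is Nothing.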

edited Nov 08 '17 at 22:46

answered Nov 08 '17 at 20:41

interesting! could you elaborate on how the list is being transformed to achieve the result a bit more please – Aria Pahlavan Nov 08 '17 at 21:08
That, however, is rather ridiculously inefficient. A list of length n has 2^n subsequences. Roughly half of them have length at least n/2, so we're talking O(n*2^n). Youch! Compare that to Willem Van Onsem's solution, which is O(n). That's not just "somewhat" more efficient. There really is no way to use `subsequences` here without going exponential. – dfeuer Nov 08 '17 at 21:17
I agree that readability is very important, but O(n*2^n) is just very hurtful; also the more efficient solution by Willem Van Onsem is pretty readable too, though this solution is more "functional" in style. – Aria Pahlavan Nov 09 '17 at 17:29