Comparing list length with arrows

Question

If I want to find the longest list in a list of lists, the simplest way is probably:

import Data.List (maximumBy)
import Data.Ord (comparing)
longestList :: [[a]] -> [a]
longestList = maximumBy (comparing length)

A more efficient way would be to precompute the lengths:

longest :: [[a]] -> [a]
longest xss = snd $ maximumBy (comparing fst) [(length xs, xs) | xs <- xss]

Now, I want to take it one step further. It may not be more efficient for normal cases, but can you solve this using arrows? My idea is basically to step through all of the lists simultaneously, and keep stepping until you've overstepped the length of every list except the longest.

longest [[1],[1],[1..2^1000],[1],[1]]

In the foregoing (very contrived) example, you would only have to take two steps through each list in order to determine that the list [1..2^1000] is the longest, without ever needing to determine the entire length of said list. Am I right that this can be done with arrows? If so, then how? If not, then why not, and how could this approach be implemented?

@luqui A section of the Haskell Wikibook on [Using arrows](http://en.wikibooks.org/wiki/Haskell/Understanding_arrows#Using_arrows) seemed to state that arrows are useful for parsing in a way that is similar to my proposed solution to this problem (look at the first element of each list, then the second, etc.). [Stephen's Arrow Tutorial](http://en.wikibooks.org/wiki/Haskell/StephensArrowTutorial) gave me the same vibe: that arrows could be used to dig into these lists and store information as they go. – Dan Burton, Oct 12 '11 at 04:04
I've accepted an answer, but if someone can whip up an answer with arrows, or thoroughly explain why arrows are not pertinent, then I will certainly accept that one instead. — Dan Burton, Oct 18 '11 at 20:28

score 4 · Answer 1 · answered Oct 11 '11 at 18:58

OK, as I was writing the question, a simple way to implement this dawned on me (without arrows, boo!)

longest [] = error "it's ambiguous"
longest [xs] = xs
longest xss = longest . filter (not . null) . map (drop 1) $ xss

Except this has a problem...it drops the first part of the list and doesn't recover it!

> take 3 $ longest [[1],[1],[1..2^1000],[1]]
[2,3,4]

Needs more bookkeeping :P

longest xs = longest' $ map (\x -> (x,x)) xs

longest' []   = error "it's ambiguous"
longest' [xs] = fst xs
longest' xss  = longest' . filter (not . null . snd) . map (sndMap (drop 1)) $ xss

sndMap f (x,y) = (x, f y)

Now it works.

> take 3 $ longest [[1],[1],[1..2^1000],[1]]
[1,2,3]

But no arrows. :( If it can be done with arrows, then hopefully this answer can give you someplace to start.
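
For what it's worth, the same bookkeeping can be written with combinators from `Control.Arrow`. The sketch below uses only the plain function instance of `Arrow`, so it is a change of notation rather than a different algorithm. The name `longestA` is made up for this example, and the error case covers both empty input and ties.

```haskell
import Control.Arrow (second, (&&&))

-- Pair each list with a "cursor" copy of itself, then repeatedly advance
-- every cursor by one element and discard lists whose cursor has run out.
-- `longestA` is a hypothetical name used only for this sketch.
longestA :: [[a]] -> [a]
longestA = go . map (id &&& id)
  where
    -- Nothing left: the input was empty or the longest lists tied.
    go []        = error "longestA: empty input or tie"
    -- One survivor: return the original list, not the advanced cursor.
    go [(xs, _)] = xs
    -- Otherwise step every cursor and drop the exhausted ones.
    go ps        = go . filter (not . null . snd) . map (second (drop 1)) $ ps
```

Here `id &&& id` plays the role of `\x -> (x,x)` and `second` replaces `sndMap`, so `take 3 $ longestA [[1],[1],[1..],[1]]` should still give `[1,2,3]`.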

Also, `error "it's ambiguous"` is a pretty crappy way to handle lists of the same length. Meh. This question is designed specifically for the case where you have a lot of short lists and one really long one. – Dan Burton, Oct 11 '11 at 19:00

score 3 · Answer 2 · answered Oct 11 '11 at 19:05

Here's the most straightforward implementation I could think of. No arrows involved, though.

I keep a list of pairs where the first element is the original list, and the second is the remaining tail. If we only have one list left, we're done. Otherwise we try taking the tail of all the remaining lists, filtering out those that are empty. If some still remain, keep going. Otherwise, they are all the same length and we arbitrarily pick the first one.

longest []  = error "longest: empty list"
longest xss = go [(xs, xs) | xs <- xss]
  where go [(xs, _)] = xs
        go xss | null xss' = fst . head $ xss
               | otherwise = go xss'
               where xss' = [(xs, ys) | (xs, (_:ys)) <- xss]

I _could_ change the second line to `longest xss = go $ map (id &&& id) xss`, but I guess that's not the kind of arrow usage you were thinking of :) — hammar, Oct 11 '11 at 19:29

score 3 · Accepted Answer · answered Oct 11 '11 at 20:59

Thinking about this some more, there is a far simpler solution which gives the same performance characteristics. We can just use maximumBy with a lazy length comparison function:

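import Data.List (maximumBy)

-- Compare two lists by length, walking only as far as the shorter one.
compareLength :: [a] -> [b] -> Ordering
compareLength [] [] = EQ
compareLength _  [] = GT
compareLength [] _  = LT
compareLength (_:xs) (_:ys) = compareLength xs ys

longest :: [[a]] -> [a]
longest = maximumBy compareLength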

hammar

+1 That is indeed very elegant. However, if I'm not mistaken, it re-traverses the lists each time it compares 2 lists to see which is longer. – Dan Burton Oct 11 '11 at 21:02
@DanBurton: Yes, but only as far as the shorter of the two lists. So it's just a question of which approach has the highest constant factors, which I must admit I have not tested. – hammar Oct 11 '11 at 21:03
Alternatively you could use an algebraic data type for binary numbers to represent the length of a list. This way you only get a logarithmic factor in the length of a list to compare the lengths of two lists and still get at least some degree of laziness. – Jan Christiansen Oct 11 '11 at 21:43
As a side note, `longest` is unnecessarily strict: for example, we have `longest ((_|_:_|_):_|_) == _|_`, although the function could already yield a list constructor. While the current implementation of `maximumBy` takes a function that yields an `Ordering`, it should rather take a predicate. – Jan Christiansen Oct 11 '11 at 21:53
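
To make the idea of a lazy numeric length from the comments concrete, here is a minimal sketch. Note that it swaps in unary (Peano) numbers rather than the binary representation suggested above, so it does not give the logarithmic factor; it only illustrates the laziness. `Nat`, `lazyLength`, and `longestNat` are names invented for this example.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Unary natural numbers; each S is produced lazily, one list cell at a time.
data Nat = Z | S Nat
  deriving Eq

-- Comparison inspects only as many constructors as the smaller number has.
instance Ord Nat where
  compare Z     Z     = EQ
  compare Z     (S _) = LT
  compare (S _) Z     = GT
  compare (S m) (S n) = compare m n

-- Length of a list as a lazy Nat: forcing k constructors inspects k cells.
lazyLength :: [a] -> Nat
lazyLength = foldr (const S) Z

-- Decorate each list with its lazy length once, then pick the maximum.
-- Like maximumBy itself, this fails on an empty outer list.
longestNat :: [[a]] -> [a]
longestNat xss = snd (maximumBy (comparing fst) [(lazyLength xs, xs) | xs <- xss])
```

Because each lazy length is built once in the decorated list and shared between comparisons, a list cell that has already been inspected should not need to be traversed again, which speaks to the re-traversal concern raised in the first comment. As with the other lazy versions, `take 3 $ longestNat [[1],[1],[1..],[1]]` should return `[1,2,3]` as long as at most one list is infinite.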
