I want to write a function that splits a list into sublists according to which items satisfy a given property p. My question is what to call the function. I'll give examples in Haskell, but the same problem would come up in F# or ML.

split :: (a -> Bool) -> [a] -> [[a]]  --- split lists into list of sublists

The sublists, concatenated, are the original list:

concat (split p xs) == xs

Every sublist satisfies the initial_p_only p property, which is to say (A) the sublist begins with an element satisfying p—and is therefore not empty, and (B) no other elements satisfy p:

initial_p_only :: (a -> Bool) -> [a] -> Bool
initial_p_only p [] = False
initial_p_only p (x:xs) = p x && all (not . p) xs

So to be precise about it,

all (initial_p_only p) (split p xs)

If the very first element in the original list does not satisfy p, split fails.
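For concreteness, here is a minimal sketch of one implementation that appears to satisfy the properties above. It is an assumption, not part of the original specification: in particular, "fails" is interpreted here as calling `error`, and the empty list is mapped to the empty list of sublists.

```haskell
-- A sketch only: one possible implementation of the specification above.
-- Assumption: "fails" means calling `error`; an empty input gives [].
split :: (a -> Bool) -> [a] -> [[a]]
split _ [] = []
split p (x:xs)
  | p x       = (x : ys) : split p zs
  | otherwise = error "split: first element does not satisfy the predicate"
  where
    -- ys: the elements up to (not including) the next one satisfying p
    -- zs: the remainder, which is empty or starts with an element satisfying p
    (ys, zs) = break p xs
```

With this sketch, `split even [2,1,3,4,5]` evaluates to `[[2,1,3],[4,5]]`. Because `zs` is either empty or begins with an element satisfying p, only the very first element of the input can trigger the error.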

This function needs to be called something other than split. What should I call it?

imz -- Ivan Zakharyaschev
Norman Ramsey
  • groupBy doesn't seem quite right. groupAt might work. – Tim Perry Mar 25 '11 at 22:06
  • I love how you wrote a formal specification without an implementation. – luqui Mar 25 '11 at 22:30
  • Sorry if this comes a bit late, but I believe your representation is wrong. "Make illegal states unrepresentable". What you should return is a list of `(a, [a])`. And your input should probably be `(a, [a])` as well, but for a slightly different reason (beware confusions). If this is a recurrent theme in your problem domain, it may also be worth it to newtype `PAndList` and `PAndNotPList` to differentiate those two types. – gasche Apr 24 '11 at 14:28

2 Answers

I believe the function you're describing is breakBefore from the list-grouping package.

Data.List.Grouping: http://hackage.haskell.org/packages/archive/list-grouping/0.1.1/doc/html/Data-List-Grouping.html

ghci> breakBefore even [3,1,4,1,5,9,2,6,5,3,5,8,9,7,9,3,2,3,8,4,6,2,6]
[[3,1],[4,1,5,9],[2],[6,5,3,5],[8,9,7,9,3],[2,3],[8],[4],[6],[2],[6]]
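
For readers who do not have the package installed, the behaviour shown in the session above can be approximated with `break` from the Prelude. This is a sketch under assumptions: it is not the package's actual source, and the name `breakBeforeSketch` is hypothetical. Unlike the specification in the question, it does not fail when the first element does not satisfy the predicate.

```haskell
-- A sketch only: reproduces the output of the ghci session above.
-- `breakBeforeSketch` is a hypothetical name, not the list-grouping API.
breakBeforeSketch :: (a -> Bool) -> [a] -> [[a]]
breakBeforeSketch _ [] = []
breakBeforeSketch p (x:xs) = (x : ys) : breakBeforeSketch p zs
  where
    -- The head always starts a group, whether or not it satisfies p;
    -- the group then runs until just before the next element satisfying p.
    (ys, zs) = break p xs
```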
luqui
ase
  • +1 for not only giving a name, but an already-existing implementation of the function. – Dan Burton Mar 25 '11 at 23:43
  • Terrifying. My code has already been written. Thanks kindly. – Norman Ramsey Mar 26 '11 at 00:24
  • I don't like plain variants of `break` since `break` itself produces a tuple and not a list. Maybe `breaksBefore` except the plural is too subtle. `unfoldBreaks` is a pretty literal name. – sclv Mar 26 '11 at 14:35
  • `breakBefore` doesn't seem to exactly match your specification because `breakBefore` doesn't require that the first element of the first grouping in the result satisfies the given predicate, but the specification does require this (otherwise the function should fail--that's what the specification says). – imz -- Ivan Zakharyaschev Apr 02 '11 at 00:01

I quite like some name based on the term "break" as adamse suggests. There are quite a few possible variants of the function. Here is what I'd expect (based on the naming used in F# libraries).

A function named just breakBefore would take an element before which it should break:

breakBefore :: Eq a => a -> [a] -> [[a]] 

A function with the With suffix would take some kind of function that directly specifies when to break. In the case of breaking, this is the function a -> Bool that you wanted:

breakBeforeWith :: (a -> Bool) -> [a] -> [[a]]

You could also imagine a function with the By suffix that takes a key selector and breaks when the key changes (which is a bit like group by, but you can have multiple groups with the same key):

breakBeforeBy :: Eq k => (a -> k) -> [a] -> [[a]]
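
To make the three signatures concrete, here is a rough sketch of how they might be implemented. The bodies are my assumption of the intended behaviour rather than code from any existing library, and none of the variants fails when the first element does not trigger a break.

```haskell
-- A sketch only: possible bodies for the three proposed signatures.

-- Break before every element satisfying the predicate.
breakBeforeWith :: (a -> Bool) -> [a] -> [[a]]
breakBeforeWith _ [] = []
breakBeforeWith p (x:xs) = (x : ys) : breakBeforeWith p zs
  where (ys, zs) = break p xs

-- Break before every occurrence of a given element.
breakBefore :: Eq a => a -> [a] -> [[a]]
breakBefore x = breakBeforeWith (== x)

-- Break whenever the key differs from the key of the current group's head.
breakBeforeBy :: Eq k => (a -> k) -> [a] -> [[a]]
breakBeforeBy _ [] = []
breakBeforeBy key (x:xs) = (x : ys) : breakBeforeBy key zs
  where (ys, zs) = span (\y -> key y == key x) xs
```

For example, `breakBefore 0 [1,0,2,0,3]` gives `[[1],[0,2],[0,3]]`, and ``breakBeforeBy (`div` 10) [1,2,11,12,3]`` gives `[[1,2],[11,12],[3]]`, with two separate groups sharing the key 0.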

I admit that the names are getting a bit long, and maybe the only function that is really useful is the one you wanted. However, F# libraries seem to use this pattern quite consistently (e.g. there is sort, sortBy taking a key selector, and sortWith taking a comparer function).

Perhaps it is possible to have these three variants for more of the list processing functions (and it's quite a good idea to have some consistent naming pattern for these three types).

Tomas Petricek