9

Given this data type

data Val = X Int | Y Bool | Z Double deriving (Eq, Show)

and a list such as

let vals = [X 1, Z 2.7, Y True, X 2, Z 3.14, Y True]

how to group elements in vals into this list,

[[X 1,X 2],[Y True,Y True],[Z 2.7, Z 3.14]]
elm
  • 20,117
  • 14
  • 67
  • 113

3 Answers3

14

To add to @RamonSnir's answer, the function for grouping a data type by constructors can be also constructed automatically using the "Scrap your boilerplate" framework:

{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data
import Data.Function (on)
import Data.List (groupBy, sort)

data Val = X Int | Y Bool | Z Double
  deriving (Eq, Ord, Show, Typeable, Data)

vals :: [Val]
vals = [X 1, Z 2.7, Y True, X 2, Z 3.14, Y True]

main :: IO ()
main = print $ groupBy (on (==) toConstr) $ sort vals

The two important parts are:

  • derive Typeable and Data, and
  • use toConstr to get the representation of the constructor used in a particular value.
Petr
  • 62,528
  • 13
  • 153
  • 317
8

I've the following:

data Val = X Int | Y Bool | Z Double deriving (Eq, Ord, Show)

vals :: [Val]
vals = [X 1, Z 2.7, Y True, X 2, Z 3.14, Y True]

valCtorEq :: Val -> Val -> Bool
valCtorEq (X _) (X _) = True
valCtorEq (Y _) (Y _) = True
valCtorEq (Z _) (Z _) = True
valCtorEq _ _ = False

And then:

*Main Data.List> groupBy valCtorEq $ sort vals
[[X 1,X 2],[Y True,Y True],[Z 2.7,Z 3.14]]
Ramon Snir
  • 7,520
  • 3
  • 43
  • 61
  • Unfortunately this is not what the OP asked for. The aim is to group values by their constructors, not by equality. – Petr Oct 22 '14 at 06:33
  • 1
    @PetrPudlák you are very correct, I used the wrong file and didn't read the output... I'll edit. – Ramon Snir Oct 22 '14 at 06:36
1

(This is probably extreme overkill, but it was a fun question to tinker around with!)

The provided answers suffer from 3 slight problems:

  • What if the type under consideration isn't in Ord (because for example, there's a function in there somewhere)?
  • Also, should this operation be O(n log n) in the length of the list?
  • Finally, the example provided isn't sufficient to determine whether the grouping should be stable, that is: should the result of grouping [X 2, X 1] be [X 1, X 2] (that's what you get if you use the sort-based solutions) or should the elements be kept in their original order?

So here is the most general solution I could come up with. It's stable, it doesn't need Ord (in fact you don't even need to touch the original datatype) and it runs in about O(n * min(n,W)) time where W is the word size of your machine (on mine, it's 64). That is, it's linear once the list gets longer than 64-ish elements (I say 'about', because the grouped elements still need to be reconstituted from the difference lists).

{-# LANGUAGE DeriveDataTypeable, StandaloneDeriving #-} 

import Data.Data
import qualified Data.IntMap as IM

groupByConstructor :: Data a => [a] -> [[a]]
groupByConstructor = map ($ []) . IM.elems . IM.fromListWith (flip (.)) 
    . map (\a -> (constrIndexOf a, (a:))) where constrIndexOf = constrIndex . toConstr

-- definition of Val as originally posed, without Ord:
data Val = X Int | Y Bool | Z Double deriving (Eq, Show)

deriving instance Typeable Val
deriving instance Data Val

-- new example:
vals = [X 2, Z 2.7, Y True, X 1, Z 3.14, Y False, Z 0.2]

and now groupByConstructor vals gives [[X 2, X 1],[Y True, Y False],[Z 2.7, Z 3.14, Z 0.2]] as I think it should.


It doesn't work for sorting lists of Ints, Chars, Floats, or non-representable types such as Ptr and Array. It could probably be made a bit more efficient by using an algorithm which uses the number of possible constructors to push the linear constant down further but IntMap will have to do for now :-)

yatima2975
  • 6,580
  • 21
  • 42