
Suppose I used the language-javascript library to build an AST in Haskell. The AST has nodes of different types, and each node can have fields of those types. Each type can have numerous constructors. (All the types are instances of Data, Eq and Show.)

I would like to count the occurrences of each type's constructors in the tree. I could use toConstr to get the constructor, and ideally I'd write a Tree -> [Constr] function first (then counting is easy).

There are different ways to do that. Obviously pattern matching is too verbose (imagine around 3 types with 9-28 constructors).

So I'd like to use a generic traversal, and I tried to find a solution in the SYB library.

  1. There is an everywhere function, which doesn't suit my needs since I don't need a Tree -> Tree transformation.
  2. There is gmapQ, which seems suitable in terms of its type, but as it turns out it's not recursive.
  3. The most viable option so far is everywhereM. It still does the useless transformation, but I can use a Writer to collect toConstr results. Still, this way doesn't really feel right.
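
To illustrate point 2, here is a minimal sketch using only Data.Data from base. The type T and the name immediateConstrs are made up for illustration and are not part of language-javascript. It shows that gmapQ applies the query to the immediate fields only:

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, Constr, gmapQ, toConstr, showConstr)

-- Toy tree type standing in for the real AST (hypothetical).
data T = L | B T T deriving (Data, Show)

-- gmapQ runs the query on each immediate field of the value and
-- collects the results; it does not descend any further.
immediateConstrs :: Data a => a -> [Constr]
immediateConstrs = gmapQ toConstr

-- Constructor names of the direct children of a small sample tree.
sample :: [String]
sample = map showConstr (immediateConstrs (B L (B (B L L) L)))
```

Here sample only contains the constructors of the two direct fields of the root, so the deeper nodes are never visited.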

Is there an alternative that will not perform a useless (for this task) transformation and still deliver the list of constructors? (The order of their appearance in the tree doesn't matter for now)

Ilya Chernov

2 Answers


Not sure if it's the simplest, but:

> data T = L | B T T deriving Data
> everything (++) (const [] `extQ` (\x -> [toConstr (x::T)])) (B L (B (B L L) L))
[B,L,B,B,L,L,L]

Here ++ says how to combine the results from subterms.

const [] is the base case for subterms that are not of type T. For those of type T, we instead apply \x -> [toConstr (x::T)].

If you have multiple tree types, you'll need to extend the query using

const [] `extQ` (handleType1) `extQ` (handleType2) `extQ` ...

This is needed to identify the types for which we want to take the constructors. If there are a lot of types, probably this can be made shorter in some way.
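
As a rough sketch of what that chain could look like with two node types: Expr, Stmt, constrsOf and allConstrs below are hypothetical names invented for this example, and only everything and extQ come from syb.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, Constr, toConstr, showConstr)
import Data.Generics (everything, extQ)

-- Two toy node types standing in for a real AST (hypothetical).
data Expr = Lit Int | Add Expr Expr deriving (Data, Show)
data Stmt = Assign String Expr | Seq Stmt Stmt deriving (Data, Show)

-- One extQ case per type we care about; everything else yields [].
constrsOf :: Data a => a -> [Constr]
constrsOf = everything (++) (const [] `extQ` exprQ `extQ` stmtQ)
  where
    exprQ :: Expr -> [Constr]
    exprQ e = [toConstr e]
    stmtQ :: Stmt -> [Constr]
    stmtQ s = [toConstr s]

-- Shorter variant with no per-type cases. It reports every Data value,
-- so constructors of Int, Char and list cells show up as well.
allConstrs :: Data a => a -> [Constr]
allConstrs = everything (++) (\x -> [toConstr x])

prog :: Stmt
prog = Seq (Assign "x" (Add (Lit 1) (Lit 2))) (Assign "y" (Lit 3))
```

The second variant avoids listing the types, at the price of having to filter out the constructors of primitive fields afterwards if they are not wanted.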

Note that the code above is not very efficient on large trees, since using ++ in this way can lead to quadratic complexity. It would be better, performance-wise, to return a Data.Map.Map Constr Int (although Constr has no Ord instance, so we would need to define some ordering for it).
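
One possible way to sketch the Map idea without writing an Ord instance is to key the map by the constructor name from showConstr. This assumes the constructor names are unique across the types being counted; countConstrs is a hypothetical name.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, toConstr, showConstr)
import Data.Generics (everything, mkQ)
import qualified Data.Map.Strict as Map

data T = L | B T T deriving (Data, Show)

-- Each T node contributes a singleton map; the maps of all subterms
-- are merged by adding up the counts. Non-T subterms contribute nothing.
-- Assumption: keying by name is enough because names do not clash.
countConstrs :: Data a => a -> Map.Map String Int
countConstrs = everything (Map.unionWith (+)) (Map.empty `mkQ` one)
  where
    one :: T -> Map.Map String Int
    one t = Map.singleton (showConstr (toConstr t)) 1

counts :: Map.Map String Int
counts = countConstrs (B L (B (B L L) L))
```

With several node types, the mkQ case can be extended with extQ in the same way as the list version.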

chi
  • Thank you for the answer! I'll look into that and into using a Map instead of a list. However, in your example the B constructor has only two fields. In my case, the constructors have different numbers of fields. Does that matter here? – Ilya Chernov Aug 03 '19 at 07:09
  • Also, could you please specify, which module's `everything` do you mean exactly? I tried a couple, neither have the type that matches with your example. – Ilya Chernov Aug 03 '19 at 12:02
  • UPD: Looks like `Data.Generics.Schemes`, is that correct? – Ilya Chernov Aug 03 '19 at 12:08
  • @IlyaChernov Correct. – chi Aug 03 '19 at 12:15
  • Alright, I tested this approach against all the use cases I've got so far and it works perfectly. Thank you very much! Next I'll try to employ `Map` instead of lists, as you suggest – Ilya Chernov Aug 03 '19 at 12:25

The universe function from the Data.Generics.Uniplate.Data module gives you a list of all the sub-trees of the same type. Using the T type from the other answer:

{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, toConstr)

data T = L | B T T deriving (Data, Show)

tree :: T
tree = B L (B (B L L) L)

λ> import Data.Generics.Uniplate.Data
λ> universe tree
[B L (B (B L L) L),L,B (B L L) L,B L L,L,L,L]
λ> fmap toConstr $ universe tree
[B,L,B,B,L,L,L]
soupi
  • This alternative is, of course, more concise. But will it work in my example, given that there are nodes of different types in my AST? – Ilya Chernov Aug 03 '19 at 16:48
  • If I understand what you want correctly, [`universeBi`](https://hackage.haskell.org/package/uniplate-1.6.12/docs/Data-Generics-Uniplate-Operations.html#v:universeBi) could be used to get all values of a specific type inside a different type. – soupi Aug 03 '19 at 18:58
  • I think it is easier to demonstrate what I wanted to do with my current implementation of it: https://github.com/ch3rn0v/dry/commit/8f4ea0bcc14087adbc9aa2c2223de437fcb5553b (particularly `countConstructorOccurrences`) – Ilya Chernov Aug 03 '19 at 20:17
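
For the mixed-type case raised in the comments, a small sketch of how universe and universeBi might be combined. Expr, Stmt and constrNames are hypothetical names; this is not taken from the linked commit.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, toConstr, showConstr)
import Data.Generics.Uniplate.Data (universe, universeBi)

-- Two toy node types standing in for a real AST (hypothetical).
data Expr = Lit Int | Add Expr Expr deriving (Data, Show)
data Stmt = Assign String Expr | Seq Stmt Stmt deriving (Data, Show)

-- universe collects every Stmt inside a Stmt (including itself);
-- universeBi collects every Expr nested anywhere inside it.
constrNames :: Stmt -> [String]
constrNames s =
  map (showConstr . toConstr) (universe s)
    ++ map (showConstr . toConstr) (universeBi s :: [Expr])

prog :: Stmt
prog = Seq (Assign "x" (Add (Lit 1) (Lit 2))) (Assign "y" (Lit 3))
```

Each additional node type needs its own universeBi call with a type annotation, so this stays concise only for a handful of types.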