
I've been parsing a proprietary file format made up of sections, each of which holds a number of records. Sections and records can appear in any order, and the order is not significant. Sections should not be duplicated, but I can't guarantee that they aren't.

I've been using parboiled2 to generate the AST using a format like the following:

oneOrMore( Section1 | Section2 | Section3 )

Every section generates a case class. These case classes don't inherit from a common type, so the parse result is a Seq[Any].

These section case classes also contain a Seq[T] of records specific to the section type.

I would like to transform the Seq[Any] into a

case class (section1:Seq[T1], section2:Seq[T2], section3:Seq[T3] )

Does someone have a clever, easy-to-read technique for that, or should I create some mutable collections and use a foreach with a match?

I always feel like I am missing some Scala magic when I fall back to a foreach with vars.

EDIT 1: It was suggested that I extend a common base class, and I could. But I don't see how that changes the solution if I still have to use a match to identify the type. I want to separate out the different case class types. For instance, in the code below I want to collect all the B's, C's, E's, and F's into a Seq[B], Seq[C], Seq[E], and Seq[F].

 class A()
 case class B(v:Int) extends A
 case class C(v:String) extends A

 case class E(v:Int)
 case class F(v:String)

 val a:Seq[A] = B(1) :: C("2") :: Nil
 val d:Seq[Any] = E(3) :: F("4") :: Nil

 a.head match {
   case B(v) => v should equal (1)
   case _ => fail()
 }

 a.last match {
   case C(v) => v should equal ("2")
   case _ => fail()
 }

 d.head match {
   case E(v) => v should equal (3)
   case _ => fail()
 }

 d.last match {
   case F(v) => v should equal ("4")
   case _ => fail()
 }

EDIT 2: Folding solution

  case class E(v:Int)
  case class F(v:String)


  val d:Seq[Any] = E(3) :: F("4") :: Nil

  val Ts = d.foldLeft((Seq[E](), Seq[F]()))(
    (c,r) => r match {
      case e:E => c.copy(_1=c._1 :+ e)
      case e:F => c.copy(_2=c._2 :+ e)
    }
  )

  Ts should equal ( (E(3) :: Nil,  F("4") :: Nil) )

EDIT 3: Exhaustivity

  sealed trait A //sealed is important
  case class E(v:Int) extends A
  case class F(v:String) extends A


  // typed as Seq[A] so the match is checked against the sealed hierarchy
  val d:Seq[A] = E(3) :: F("4") :: Nil

  val Ts = d.foldLeft((Seq[E](), Seq[F]()))(
    (c,r) => r match {
      case e:E => c.copy(_1=c._1 :+ e)
      case e:F => c.copy(_2=c._2 :+ e)
    }
  )

  Ts should equal ( (E(3) :: Nil,  F("4") :: Nil) )
Thomas
  • Why not have your section types be case classes extending a common base type? Then you can parse into that base type and separate without worrying about having to recover type safety. – Travis Brown Mar 08 '16 at 19:14
  • I am not sure how that changes what I need to do, see my edit above. – Thomas Mar 10 '16 at 18:13
  • In your update, `a.collect { case b @ B(_) => b }` will return a `Seq[B]`, you could fold into a `(Seq[B], Seq[C])` and get exhaustivity checking, etc. – Travis Brown Mar 10 '16 at 18:35
  • …or if you want something more convenient but still type-safe, here's a [blog post](https://meta.plasm.us/posts/2014/06/14/partitioning-by-constructor/) I wrote a couple of years ago. – Travis Brown Mar 10 '16 at 18:36
  • Travis, thanks for that blog post. It does look like an interesting option, though we currently don't use shapeless, so bringing it in for this case will take some consideration. You also mentioned folding. I had glanced at it but had not thought about how to make tuples work. Your comment led me to look at it in more depth. I have updated my question to include where that led me. Is that what you were thinking, or could it be improved further? – Thomas Mar 15 '16 at 16:47
  • Yes, that's the idea I had in mind, but if you had a common superclass for `E` and `F` you'd get exhaustivity checking in the match, which is really nice. – Travis Brown Mar 15 '16 at 17:44
  • I tried with and without the case class inheriting and having a non-exhaustive list (left out F in the match). Neither appeared to result in a compiler warning but both would give a runtime match error. – Thomas Mar 15 '16 at 20:19
  • Oh, you need to use the `case E(_) =>` syntax. In general `case e: E =>` in Scala is dangerous / broken / a bad idea. – Travis Brown Mar 15 '16 at 20:21
  • I figured out that the base trait/class needs to be sealed for the exhaustivity warning. That in itself does justify inheriting from a common point. Many thanks for your input! – Thomas Mar 15 '16 at 23:27

1 Answer


This could be done more tersely with shapeless, as Travis pointed out, but I chose a pure Scala solution based on his feedback.

Here is an example that uses foldLeft to build up a tuple of strongly typed Seqs. Unfortunately, every possible type needs its own case in the match, which can become tedious if there are many types.

Also note that if the base trait is sealed, the compiler gives an exhaustivity warning when the match misses a type, which makes this operation type-safe.

  sealed trait A //sealed is important
  case class E(v:Int) extends A
  case class F(v:String) extends A


  val d:Seq[A] = E(3) :: F("4") :: Nil

  val Ts = d.foldLeft((Seq[E](), Seq[F]()))(
    (c,r) => r match {
      case e:E => c.copy(_1=c._1 :+ e)
      case e:F => c.copy(_2=c._2 :+ e)
    }
  )

  Ts should equal ( (E(3) :: Nil,  F("4") :: Nil) )
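
The same fold can target the result case class from the question instead of a tuple. The following is only a sketch under assumptions: `Rec1`, `Rec2`, `Section1`, `Section2`, `Parsed`, and `collate` are hypothetical names, since the real section and record types are not shown. It merges duplicated sections by appending their records, which is one possible policy, not necessarily the right one for the format.

```scala
object SectionCollation {
  // Hypothetical record types; the real format's records are not shown.
  final case class Rec1(v: Int)
  final case class Rec2(v: String)

  // sealed lets the compiler check that the match below covers every section.
  sealed trait Section
  final case class Section1(records: Seq[Rec1]) extends Section
  final case class Section2(records: Seq[Rec2]) extends Section

  // Target shape: one strongly typed Seq per section type.
  final case class Parsed(
    section1: Seq[Rec1] = Vector.empty,
    section2: Seq[Rec2] = Vector.empty
  )

  // Fold over the sections in whatever order they were parsed.
  def collate(sections: Seq[Section]): Parsed =
    sections.foldLeft(Parsed()) { (acc, section) =>
      section match {
        // A repeated section has its records appended (assumed policy).
        case Section1(rs) => acc.copy(section1 = acc.section1 ++ rs)
        case Section2(rs) => acc.copy(section2 = acc.section2 ++ rs)
      }
    }
}
```

Named fields keep each case readable as the number of section types grows, and removing one of the cases should trigger the same exhaustivity warning as in the tuple version.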
Thomas