I am currently processing big XML files with Scala 3. I want every node (XML element) to report automatically when some of its attributes or children were never read. I have a working solution, but it does not feel very elegant, and it may even abuse the original purpose of scala.util.Using. I am not sure, though.

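import scala.annotation.targetName
import scala.util.Using
import scala.util.Using.Releasable
import scala.xml.{Node, NodeSeq}

object ReportingNode: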
  implicit val releasable: Releasable[ReportingNode] = node =>
    node.assumeNoOtherAttributes()
    node.assumeNoOtherChildren()

class ReportingNode(node: Node):
  private val description = (node.label + " " + node \@ "OID").trim
  private var accessedAttributes = Set[String]()
  private var accessedChildren = Set[String]()

  @targetName("attribute")
  def \@(attributeName: String): String =
    accessedAttributes += attributeName
    node \@ attributeName

  @targetName("children")
  def \(childName: String): NodeSeq =
    accessedChildren += childName
    node \ childName

  private def assumeNoOtherAttributes(): Unit =
    for (attribute <- node.attributes)
      if !accessedAttributes.contains(attribute.key) then
        println(s"WARN $description has unexpected attribute ${attribute.key}")

  private def assumeNoOtherChildren(): Unit =
    for (child <- node.child)
      // Text nodes carry the label "#PCDATA" and are always accepted.
      if !(accessedChildren + "#PCDATA").contains(child.label) then
        println(s"WARN $description has unexpected child ${child.label}")

Usage for something like <foo oid="42" name="bob"><bar...:

case class Foo(context: String, name: String, bars: Seq[Bar])

object Foo:
  def apply(n: Node): Foo = Using.resource(ReportingNode(n))(node =>
    val oid = node \@ "oid"
    val name = node \@ "name"
    val bars = (node \ "bar").map(Bar.apply)
    Foo(oid, name, bars)
  )

Any idea how to do such a thing nicely? Or is this already the best it gets?

Some context: You may ask why I am translating the nodes to a custom structure in the first place. Well, here I am simplifying a little. The XML is quite complex (CDISC ODM plus extensions) and my code is resolving references to definitions, etc.

Marcus

1 Answer

Well, at least the usage looks clean now:

object Foo:
  def apply(n: Node): Foo = ReportingNode(n)(node =>
    val oid = node \@ "oid"
    val name = node \@ "name"
    val bars = (node \ "bar").map(Bar.apply)
    Foo(oid, name, bars)
  )

I achieved this by moving the Using.resource call into an apply method on the ReportingNode companion object:

object ReportingNode:
  implicit val releasable: Releasable[ReportingNode] = node =>
    node.assumeNoOtherAttributes()
    node.assumeNoOtherChildren()
  def apply[A](node: Node)(job: ReportingNode => A): A =
    Using.resource[ReportingNode, A](new ReportingNode(node))(job)

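The classes remain unchanged. Note that new ReportingNode(node) is needed inside apply: once the companion object defines its own apply, Scala 3 no longer generates the constructor proxy, so ReportingNode(node) without new would resolve to this apply instead of the constructor.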
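
One caveat with going through Using.resource, sketched below under a few assumptions: the Releasable also runs when the block throws, so a node that was only partly read still reports everything it did not get to. Tracked and runTracked are hypothetical stand-ins for ReportingNode. They avoid the scala-xml dependency and collect warnings in a buffer instead of printing them, and they are not part of the code above.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.{Try, Using}
import scala.util.Using.Releasable

// Hypothetical stand-in for ReportingNode: tracks which keys were read.
final class Tracked(keys: Set[String], warnings: ListBuffer[String]):
  private var accessed = Set.empty[String]

  def read(key: String): String =
    accessed += key
    key

  // Records one warning per key that was never read.
  def report(): Unit =
    (keys -- accessed).toSeq.sorted.foreach(k => warnings += s"WARN unexpected $k")

object Tracked:
  // Same trick as in ReportingNode: "releasing" means running the checks.
  implicit val releasable: Releasable[Tracked] = tracked => tracked.report()

// Runs `job` under Using.resource on a node with the keys "name" and "oid".
// Returns whether the job succeeded, plus the warnings recorded on release.
def runTracked(job: Tracked => Any): (Boolean, List[String]) =
  val warnings = ListBuffer.empty[String]
  val outcome = Try(Using.resource(new Tracked(Set("name", "oid"), warnings))(job))
  (outcome.isSuccess, warnings.toList)

// Normal case: only "oid" is read, so "name" is reported.
val readOne = runTracked(t => t.read("oid"))

// Failure case: the job throws before reading anything,
// yet the release step still runs and reports both keys.
val failedEarly = runTracked(_ => sys.error("boom"))
```

If those extra warnings on failure are unwanted, running the checks only after the block has returned normally would be an alternative to a Releasable.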

Marcus