3

I have an XML document (stored in the variable report) that looks like this:

<wrapper>
    <Sample Id="SomeId1">
        <Tag Id="SomeTag">
          <Lane Id="1">
           [...]
          </Lane>
        </Tag>
    </Sample>
    <Sample Id="SomeId2">
        <Tag Id="SomeTag">
          <Lane Id="1">
           [...]
          </Lane>
        </Tag>
    </Sample>
</wrapper>

I want to extract the "Id" attribute from each "Sample" node. I read the following article, http://www.codecommit.com/blog/scala/working-with-scalas-xml-support, and then tried:

(report \\ "Sample" \ "@Id").text

This returned an empty string:

scala> (report \\ "Sample" \ "@Id").text
res16: String = ""

But I expected to get "SomeId1SomeId2" back. What have I done wrong?

I found several questions which are similar to mine. Example: Scala: XML Attribute parsing

Johan
  • What is the exact problem? I've just tried your code and successfully got `String = SomeId` – om-nom-nom Aug 01 '12 at 12:06
  • For me it returns an empty string: scala> (report \\ "Sample" \ "@Id").text res0: String = "" – Johan Aug 01 '12 at 12:09
  • IMO, you're definitely doing the right thing, but there could be a typo or a difference in case, e.g. "id" instead of "Id" or something like that. Please check again. – om-nom-nom Aug 01 '12 at 12:13
  • Ok. I'm at a loss here, because now that did work... (must have been a typo as you suggested). However, now I have updated the question to reflect the problem as it looks now. – Johan Aug 01 '12 at 12:38

5 Answers

4

I got it working like this:

(xml \\ "Sample").map(n => (n \ "@Id").text)

=> scala.collection.immutable.Seq[String] = List(SomeId1, SomeId2)

but there must be a better solution…
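For anyone who wants to try this outside the REPL, here is a minimal self-contained sketch. It assumes the scala-xml module is on the classpath (it has been a separate library since Scala 2.11). The object name `SampleIds` and the trimmed-down document are illustrative only, not part of the original question. It shows the `map` form above next to the equivalent for-expression.

```scala
import scala.xml.Elem

object SampleIds {
  // Trimmed copy of the document from the question (illustrative data).
  val report: Elem =
    <wrapper>
      <Sample Id="SomeId1"><Tag Id="SomeTag"><Lane Id="1"/></Tag></Sample>
      <Sample Id="SomeId2"><Tag Id="SomeTag"><Lane Id="1"/></Tag></Sample>
    </wrapper>

  // Select the attribute on one node at a time, then collect the strings.
  val ids: List[String] =
    (report \\ "Sample").map(n => (n \ "@Id").text).toList

  // The same traversal written as a for-expression.
  val idsFor: List[String] =
    (for (s <- report \\ "Sample") yield (s \ "@Id").text).toList

  def main(args: Array[String]): Unit =
    println(ids.mkString(", "))
}
```

Because the attribute selector is applied to a single node each time, only the `Id` of each `Sample` is picked up, not the `Id` attributes of the nested `Tag` and `Lane` elements.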

Jean-Philippe Pellet
  • It does seem a bit tricky, but at least it works. Thank you! :) – Johan Aug 01 '12 at 12:51
  • I think this is the correct way, although the examples in *Programming in Scala* use for-expressions, i.e. `for (s <- report \\ "Sample") yield (s \ "@Id").text`. This makes it a bit easier if you need to dig through the layers and add filters. – Luigi Plinge Aug 01 '12 at 17:30
  • There is a better way - using [Scales Xml](https://github.com/chris-twiner/scalesXml) , see [this gist](https://gist.github.com/3240166) for an example solution for this question – Chris Aug 02 '12 at 20:00
2

If you use the \ selector to pick out an attribute on a NodeSeq with more than one element, you'll get an empty result, as you can see from the source:

def \(that: String): NodeSeq = {
  ...
  that match {
    case "" => fail
    case "_" => makeSeq(!_.isAtom)
    case _ if (that(0) == '@' && this.length == 1) => atResult
    case _ => makeSeq(_.label == that)
  }
}

I've wondered about this before, and if I remember correctly I wasn't able to determine that this is documented behavior—I definitely can't find documentation at the moment.

The current implementation at any rate feels like a hack, and leads to some bizarre behavior:

scala> val bar = <bar>{ <baz/>.copy(label = "@baz") }</bar>
bar: scala.xml.Elem = <bar><@baz></@baz></bar>

scala> <foo>{ bar }</foo> \\ "bar" \ "@baz"
res0: scala.xml.NodeSeq = NodeSeq()

scala> <foo>{ bar ++ bar }</foo> \\ "bar" \ "@baz"
res1: scala.xml.NodeSeq = NodeSeq(<@baz></@baz>, <@baz></@baz>)

It's a perverse example, but the result is still pretty horrifying.

As a workaround, I'd personally write something like (report \\ "Sample").flatMap(_ \ "@Id") to get a NodeSeq of the attribute values (as text nodes), and then map text over that if I needed the strings.
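A minimal sketch of that workaround, assuming scala-xml is on the classpath. The object name `AttributeNodes` and the shortened sample document are made up for illustration.

```scala
import scala.xml.{Elem, NodeSeq}

object AttributeNodes {
  // Shortened stand-in for the document from the question (illustrative data).
  val report: Elem =
    <wrapper>
      <Sample Id="SomeId1"><Tag Id="SomeTag"/></Sample>
      <Sample Id="SomeId2"><Tag Id="SomeTag"/></Sample>
    </wrapper>

  // Apply the attribute selector to each Sample node separately and
  // flatten the per-node results into one NodeSeq.
  val idNodes: NodeSeq = (report \\ "Sample").flatMap(_ \ "@Id")

  // Map text over the attribute nodes only when strings are needed.
  val ids: List[String] = idNodes.map(_.text).toList

  def main(args: Array[String]): Unit =
    ids.foreach(println)
}
```

Keeping the intermediate result as a NodeSeq means further selectors or filters can still be applied before converting to strings.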

Travis Brown
1

Experimenting some more with this, I found an alternative to the solution provided by @Jean-Philippe Pellet, which I think is slightly clearer (even if I'm sure there are better ways to do this).

report.\\("Sample").foreach(s => println(s.attribute("Id").get.text))

This prints:

scala> report.\\("Sample").foreach(s => println(s.attribute("Id").get.text))
SomeId1
SomeId2

Since the \\ method returns a NodeSeq, one can iterate over each Node, read its attributes, and do something with them. Here I just convert them to Strings and print them, but I guess this would allow for more complex operations as well.
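One caveat: `attribute` returns an Option, so calling `.get` throws if a Sample has no "Id". A hedged sketch of a variant that skips such nodes instead is below. It assumes scala-xml is on the classpath, and the object name `SafeAttribute` and the document with a deliberately attribute-less `Sample` are invented for illustration.

```scala
import scala.xml.Elem

object SafeAttribute {
  // Illustrative document: the middle Sample has no Id attribute.
  val report: Elem =
    <wrapper>
      <Sample Id="SomeId1"/>
      <Sample/>
      <Sample Id="SomeId2"/>
    </wrapper>

  // attribute("Id") yields Some(value nodes) or None; flatMap over the
  // list drops the None cases instead of throwing like .get would.
  val ids: List[String] =
    (report \\ "Sample").toList.flatMap(s => s.attribute("Id").map(_.text))

  def main(args: Array[String]): Unit =
    ids.foreach(println)
}
```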

Johan
0

The following:

(report \ "Sample").head \ "@Id"

results in a NodeSeq containing your attribute. Selecting an attribute seems to require a single node (unfortunately, I found no documentation on that assumption; reference links welcome).
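Note that `head` throws on an empty NodeSeq. A small sketch using `headOption` instead, assuming scala-xml is on the classpath; the object name `FirstSampleId` and both documents are illustrative only.

```scala
import scala.xml.Elem

object FirstSampleId {
  // Illustrative documents: one with Sample children, one without.
  val report: Elem =
    <wrapper>
      <Sample Id="SomeId1"/>
      <Sample Id="SomeId2"/>
    </wrapper>
  val empty: Elem = <wrapper/>

  // headOption yields a single Node (or None), so the attribute
  // selector is always applied to exactly one node.
  def firstId(doc: Elem): Option[String] =
    (doc \ "Sample").headOption.map(n => (n \ "@Id").text)

  def main(args: Array[String]): Unit = {
    println(firstId(report))
    println(firstId(empty))
  }
}
```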

Skyr
  • This only gets the attribute for the first of the two Sample nodes, though. Otherwise a good solution. The reason that one cannot get an attribute from a NodeSeq is that it does not have such a method: http://www.scala-lang.org/api/current/scala/xml/NodeSeq.html, which I guess is reasonable since only the nodes would actually have them. But still, some sort of method to fetch all attributes in the NodeSeq would be nice. – Johan Aug 01 '12 at 15:16
  • Ah ok, sorry, I misread the question. Somehow I thought you were interested in the first attribute only. I just re-read [this blog post on Scala XML support](http://www.codecommit.com/blog/scala/working-with-scalas-xml-support), which was written for Scala 2.7. The example given there, ns \\ "bar" \ "@id", is exactly what you tried at first, and on my current Scala installation it doesn't work either :-) So is this a change of behaviour? Or a Scala bug? – Skyr Aug 02 '12 at 08:11
-3

You have to use [] instead of \

\\"Sample"[@Id]
Phebus40
  • No, the Scala DSL doesn't work like that (square brackets are used only for types), and you'll get `error: ';' expected but '[' found.` because Scala can't parse it – om-nom-nom Aug 01 '12 at 12:01