I am trying to modify HTML in Groovy. I parsed it using XmlSlurper. The problem is that I need to edit the text of a tag that contains both text and child tags. The HTML looks like this:

<ul><li>Text to modify<span>more text</span></li></ul>

In Groovy I am trying this code:

def ulDOM = new XmlSlurper().parseText(ul);
def elements = ulDOM.li.findAll{
    it.text().equals("text i am looking for");
}

The problem is that 'elements' comes back empty, because it.text() returns the text of the 'it' node together with the text nodes of its whole DOM subtree; in this case, "Text to modifymore text". Note that the contains() method is not enough for my purposes.

My question is: how do I get the exact text of a certain tag, and not the text of the whole DOM subtree?

Jayan
Bazyl

1 Answer

.text() evaluates the children and appends their text, so it will always include the merged text.

Could you consider localText()? It is not exactly what you expect: it returns a list of strings, one for each direct text node.

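def ul = '''<ul>
          <li>Text to modify<span>more text</span>
          </li>
       </ul> '''

def ulDOM = new XmlSlurper().parseText(ul)

// localText() returns only the text nodes that are direct children of <li>,
// so the text inside <span> is not included
def elements = ulDOM.li.findAll {
    def text = it.localText()
    text && text[0] == "Text to modify"
}
// Groovy's built-in assert avoids the TestNG dependency
assert elements.size() == 1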
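
To make the difference concrete, here is a minimal sketch that puts text() and localText() side by side on the single-line HTML from the question. The variable names (html, dom, merged, own, matches) and the second <li> are only illustrative. The import assumes Groovy 3 or later; on Groovy 2.x, XmlSlurper lives in groovy.util and the import line can be dropped.

```groovy
// Assumption: Groovy 3+ (on Groovy 2.x, remove this import)
import groovy.xml.XmlSlurper

// The first <li> is the one from the question; the second is an illustrative extra
def html = '<ul><li>Text to modify<span>more text</span></li><li>Other<span>x</span></li></ul>'
def dom = new XmlSlurper().parseText(html)
def li = dom.li[0]

def merged = li.text()       // text of the node plus all of its descendants
def own = li.localText()     // only the direct text nodes, as a list

assert merged == 'Text to modifymore text'
assert own == ['Text to modify']

// Exact match on the element's own text; join() covers the case where
// child tags split the own text into several pieces
def matches = dom.li.findAll { it.localText().join('') == 'Text to modify' }
assert matches.size() == 1
```

Joining the list before comparing is a design choice for this sketch: it treats all direct text nodes as one string, which may or may not be what you want if the text is interleaved with child tags.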
Jayan
  • Thanks. The localText() method was exactly what I was looking for. The question is why I couldn't find it in the documentation [here](http://groovy.codehaus.org/api/groovy/util/slurpersupport/GPathResult.html)? – Bazyl Nov 20 '14 at 12:55
  • Anyway, I have already changed the library I use for HTML parsing. I used JSoup and did it in 5 minutes, so I recommend it to anyone who needs to modify HTML in Groovy. – Bazyl Nov 20 '14 at 13:03
  • Great. If anyone asks about plain HTML manipulation, the first recommendation is always jsoup. I guess my edit made it worse: I removed that tag, thinking this was a plain XML edit. – Jayan Nov 20 '14 at 13:15