7

I have a piece of HTML like this:

<a href="/something">
     Title
    <span>Author</span>
</a>

I have a WebElement that matches this HTML. How can I extract only "Title" from it? The .getText() method returns "Title\nAuthor".

Mathias Müller
zorglub76

5 Answers

7

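You can't do this with the WebDriver API alone; you have to do it in your own code. For example:

// Sketch: assumes getText() returns a string directly (await it if your binding returns a promise).
var textOfA = theAElement.getText();
var textOfSpan = theSpanElement.getText();
// Cut the span's text off the end, then trim the leftover newline and whitespace.
var text = textOfA.slice(0, textOfA.length - textOfSpan.length).trim();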

Note that the trailing newline is actually part of the text of the <a> element, so if you don't want it, you need to strip it.
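
The same subtraction can be sketched in Python as a plain string helper, which makes it easy to check without a browser. The name own_text and its parameters are hypothetical, and the sketch assumes the child's text sits at the end of the parent's text, as it does in the HTML from the question.

```python
def own_text(parent_text: str, child_text: str) -> str:
    """Return parent_text with a trailing child_text removed.

    Assumes the child's text is at the end of the parent's text,
    which holds for the <a>Title<span>Author</span></a> example.
    """
    if child_text and parent_text.endswith(child_text):
        parent_text = parent_text[: -len(child_text)]
    # Strip the newline and spaces left where the child used to be.
    return parent_text.strip()


# With Selenium's Python bindings the inputs would come from link.text and span.text.
print(own_text("Title\nAuthor", "Author"))
```

If the child's text can appear in the middle of the parent's text, this approach is not enough and walking the text nodes directly is the safer route.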

Ross Patterson
0

Here is a helper written in Python.

def get_text_exclude_children(element):
    return driver.execute_script(
        """
        var parent = arguments[0];
        var child = parent.firstChild;
        var textValue = "";
        while(child) {
            if (child.nodeType === Node.TEXT_NODE) {
                textValue += child.textContent;
            }
            child = child.nextSibling;
        }
        return textValue;""",
        element).strip()

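How to use it in this case:

from selenium.webdriver.common.by import By

# find_element_by_xpath was removed in Selenium 4; find_element(By.XPATH, ...) replaces it.
liElement = driver.find_element(By.XPATH, "//a[@href='your_href_goes_here']")
liOnlyText = get_text_exclude_children(liElement)
print(liOnlyText)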

Use whatever locator strategy suits you to find the element; the helper only needs the element whose own text (without its children's text) you want.
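
To see why filtering on text nodes works, the same walk can be reproduced offline with Python's standard xml.dom.minidom on the markup from the question. This is only an illustrative sketch: direct_text is a hypothetical name, and it runs against a parsed string rather than a live browser DOM.

```python
from xml.dom import minidom

HTML = '<a href="/something">\n     Title\n    <span>Author</span>\n</a>'


def direct_text(element) -> str:
    """Concatenate the element's own text nodes, ignoring child elements."""
    parts = []
    for child in element.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
    return "".join(parts).strip()


link = minidom.parseString(HTML).documentElement
# The <a> has three children: a text node, the <span>, and a whitespace-only text node.
print(direct_text(link))
```

The whitespace-only text node after the span is why a final strip is needed.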

supputuri
0

If using Python:

[x['textContent'].strip() for x in element.get_property('childNodes') if isinstance(x, dict)]

Where element is your element.

This will return ['Title', '']; the empty string comes from the whitespace-only text node after the span.

Vftdan
0

You can use JavascriptExecutor to iterate over the child nodes, pick out the text node 'Title', and return its content, like this:

WebElement link = driver.findElement(By.xpath("//a[@href='/something']"));
JavascriptExecutor js = (JavascriptExecutor) driver;
// Return the text of the first text-node child, skipping element children such as <span>.
String titleText = (String) js.executeScript(
    "for (var i = 0; i < arguments[0].childNodes.length; i++) {"
    + " if (arguments[0].childNodes[i].nodeName == '#text') {"
    + " return arguments[0].childNodes[i].textContent; } }", link);

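The JavaScript above loops over the child nodes of the link and returns the text content of the first text node it finds ('Title'), so the text of the SPAN ('Author') is never included. The returned string still carries the surrounding whitespace, so trim it if needed.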

Note: Before this, I tried selecting the text node directly in the XPath, as shown below, but WebDriver throws an InvalidSelectorException because it requires the locator to resolve to an element, not a text node.

WebElement link = driver.findElement(By.xpath("//a[@href='/something']/text()"));

Dharman
Bhuvanesh Mani
0

Check whether an element is present for "//a[normalize-space(text())='Title']". The check succeeds only if the text directly inside the 'a' tag is 'Title'.

cruisepandey
GirishB