Getting text from a node

Question

I have a piece of HTML like this:

<a href="/something">
     Title
    <span>Author</span>
</a>

I got a WebElement that matches this HTML. How can I extract only "Title" from it? Method .getText() returns "Title\nAuthor"...

score 7 · Accepted Answer · answered Dec 14 '11 at 14:53


You can't do this through the WebDriver API alone; you have to do it in your own code. For example:

var textOfA = theAElement.getText();
var textOfSpan = theSpanElement.getText();
var text = textOfA.substring(0, textOfA.length - textOfSpan.length).trim();

Note that the trailing newline is actually part of the text of the <a> element, so if you don't want it, you need to strip it.
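The same subtraction idea can be sketched in Python as a small string helper. The function name and the Selenium usage in the comment are illustrative assumptions, not part of the answer above, and the sketch only handles a single child whose text sits at the end of the parent's text.

```python
def text_without_trailing_child(parent_text: str, child_text: str) -> str:
    """Return parent_text with child_text cut from its end, then stripped."""
    # Only cut when the parent's text really ends with the child's text
    if child_text and parent_text.endswith(child_text):
        parent_text = parent_text[: -len(child_text)]
    # strip() also removes the newline left between "Title" and "Author"
    return parent_text.strip()


# Possible use with Selenium's Python bindings (hypothetical variable names):
# span = link.find_element("tag name", "span")
# title = text_without_trailing_child(link.text, span.text)
print(text_without_trailing_child("Title\nAuthor", "Author"))  # prints "Title"
```

Checking endswith first avoids cutting the wrong characters when the child's text is not actually at the end of the parent's text.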

Ross Patterson


I did it like this, eventually. Hoped I could get it through the API/XPath/Whatever... – zorglub76 Dec 14 '11 at 15:41

score 0 · Answer 2 · answered Mar 28 '19 at 00:03

Here is a method written in Python.

def get_text_exclude_children(element):
    # Concatenate only the element's direct text nodes, skipping child elements
    return driver.execute_script(
        """
        var parent = arguments[0];
        var child = parent.firstChild;
        var textValue = "";
        while (child) {
            if (child.nodeType === Node.TEXT_NODE) {
                textValue += child.textContent;
            }
            child = child.nextSibling;
        }
        return textValue;""",
        element).strip()

How to use it:

from selenium.webdriver.common.by import By

# find_element_by_xpath was removed in Selenium 4; use find_element(By.XPATH, ...)
liElement = driver.find_element(By.XPATH, "//a[@href='your_href_goes_here']")
liOnlyText = get_text_exclude_children(liElement)
print(liOnlyText)

Use whatever locator strategy you prefer to find the element; the method only needs the element whose own text (excluding its children's text) you want.

score 0 · Answer 3 · answered Jul 25 '20 at 15:48


If using Python:

[x['textContent'].strip() for x in element.get_property('childNodes') if isinstance(x, dict)]

Where element is your element.

This will return ['Title', ''] (the empty string comes from the whitespace after the span).
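If a single string is wanted instead of a list, the non-empty parts can be joined. The helper below is a sketch under the same assumption the one-liner makes, namely that text nodes arrive as dicts with a 'textContent' key while element nodes arrive as other objects. The name own_text and the sample data are made up for illustration.

```python
def own_text(child_nodes):
    """Join the stripped, non-empty text of the dict-shaped (text) nodes."""
    parts = []
    for node in child_nodes:
        # Assumption: text nodes are dicts; element nodes are not
        if isinstance(node, dict):
            text = (node.get("textContent") or "").strip()
            if text:
                parts.append(text)
    return " ".join(parts)


# Stand-in for element.get_property('childNodes') on the question's HTML
sample = [{"textContent": "\n     Title\n    "}, object(), {"textContent": "\n"}]
print(own_text(sample))  # prints "Title"
```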

Vftdan


score 0 · Answer 4 · edited Jul 02 '21 at 06:05

You can use JavascriptExecutor to iterate over the child nodes, pick out the text node 'Title', and return its content, like this:

WebElement link = driver.findElement(By.xpath("//a[@href='/something']"));
JavascriptExecutor js = (JavascriptExecutor) driver;
// Return the content of the first text node among the link's child nodes
// (the result keeps its surrounding whitespace; call trim() if needed)
String titleText = (String) js.executeScript(
    "for (var i = 0; i < arguments[0].childNodes.length; i++) {"
    + " if (arguments[0].childNodes[i].nodeName == \"#text\") { return arguments[0].childNodes[i].textContent; } }",
    link);

The JavaScript above iterates over the link's child nodes, both the text node ('Title') and the SPAN ('Author'), but returns the text content of the first text node only.

Note: Before this, I tried selecting the text node directly in the XPath, as below, but WebDriver throws an InvalidSelectorException because it requires the XPath to resolve to an element, not a text node.

WebElement link = driver.findElement(By.xpath("//a[@href='/something']/text()"));

score 0 · Answer 5 · edited Oct 01 '21 at 14:37


Check whether an element matching "//a[normalize-space(text())='Title']" is present. A match means that the text directly inside the 'a' tag is 'Title'.
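In Python, the presence check might look like the sketch below. The function name is hypothetical, the locator is passed as the plain string "xpath" (the value behind By.XPATH) so the snippet needs no imports, and text containing a single quote is rejected rather than escaped.

```python
def has_link_with_own_text(driver, text):
    """Return True if some <a> has a first text node that normalizes to text."""
    if "'" in text:
        # Assumption for this sketch: no quote escaping is attempted
        raise ValueError("this sketch does not escape single quotes")
    xpath = f"//a[normalize-space(text())='{text}']"
    # find_elements returns an empty list when nothing matches, instead of raising
    return len(driver.find_elements("xpath", xpath)) > 0


# Possible use, given an existing WebDriver instance named driver:
# if has_link_with_own_text(driver, "Title"):
#     print("link found")
```

Note that this only confirms the text is there; it does not extract it.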

cruisepandey


answered Dec 19 '11 at 13:56

GirishB

