I have already attempted this with some code I found online, and I think the problem may have something to do with that earlier implementation.

The idea of this app is to scrape any website dynamically. It does the hard work up front by walking every element on the page and indexing the important information, such as buttons and their relative XPaths.

I am having trouble detecting whether the element I am currently iterating over is a shadow host, that is, whether it has a shadow root attached.

public void getListOfElements(List<WebElement> e, String previous) {
    if (e.isEmpty()) {
        // Base case: no children left to visit.
        return;
    }

    for (WebElement elem : e) {
        // Sanity check: confirm that the known shadow host is actually visited.
        // The constant-first equals avoids a NullPointerException when there is no id.
        if ("wmHostPrimary".equals(elem.getAttribute("id"))) {
            System.out.println("I FOUND THE WMHOSTPRIMARY");
        }

        // The check above works and prints, so why does the lookup below not work?
        WebElement potShadowRoot = getShadowRoot(webDriver, elem);
        if (potShadowRoot != null) {
            // This branch never runs; I assume potShadowRoot is always null.
            System.out.println("Shadow root element found\n\n");
            getListOfElements(potShadowRoot.findElements(By.xpath(".//*")), elem.getTagName());
        } else {
            // No shadow root: descend into the regular children via XPath.
            List<WebElement> webElems = webDriver.findElements(By.xpath(previous + "/" + elem.getTagName() + "/*"));
            getListOfElements(webElems, previous + "/" + elem.getTagName());
        }
    }
}

The function below always returns null, even though I have verified with the check above that the shadow host is being visited:

private static WebElement getShadowRoot(WebDriver driver,WebElement shadowHost) {
    JavascriptExecutor js = (JavascriptExecutor) driver;
    try {
        return (WebElement) js.executeScript("return arguments[0].shadowRoot", shadowHost);
    } catch(Exception e) {
        return null;
    }
}
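
One thing worth ruling out, although I cannot confirm it from this snippet alone: the catch-all in getShadowRoot turns every failure into null, including a failed cast. Selenium 4 exposes shadow roots through WebElement.getShadowRoot(), which returns a SearchContext rather than a WebElement, so the object coming back from executeScript may simply not be a WebElement. The sketch below needs no Selenium. FakeElement, FakeShadowRoot, swallowing and diagnosing are hypothetical stand-ins made up for illustration; it only shows how the same try/catch shape hides a ClassCastException, and how logging the runtime type exposes it.

```java
import java.util.function.Supplier;

/** Stand-in demo: a catch-all around a cast turns "wrong type" into a silent null. */
final class ShadowRootLookupDemo {

    /** Hypothetical stand-in for WebElement. */
    interface FakeElement {}

    /** Hypothetical stand-in for whatever object the driver returns for a shadow root. */
    static final class FakeShadowRoot {}

    /** Same shape as getShadowRoot in the question: any exception becomes null. */
    static FakeElement swallowing(Supplier<Object> script) {
        try {
            return (FakeElement) script.get(); // throws ClassCastException here
        } catch (Exception e) {
            return null; // the real cause is lost
        }
    }

    /** Diagnostic variant: no cast, and the runtime type or the failure is reported. */
    static Object diagnosing(Supplier<Object> script) {
        try {
            Object result = script.get();
            System.out.println("script returned: "
                    + (result == null ? "null" : result.getClass().getName()));
            return result;
        } catch (RuntimeException e) {
            System.out.println("script failed: " + e);
            throw e;
        }
    }

    public static void main(String[] args) {
        Supplier<Object> script = FakeShadowRoot::new; // pretend executeScript result
        System.out.println(swallowing(script));        // the cast fails, so this is null
        diagnosing(script);                            // reports the actual runtime type
    }
}
```

Applied to the real code, the equivalent step would be to log the exception inside the catch block, or to print the class of the object returned by executeScript before casting it.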

When I call the function, I pass in the children of the top-level html tag:

getListOfElements(webDriver.findElements(By.xpath("/html/*")), "html");
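
A side note on the traversal itself, separate from the shadow root problem: an XPath built as previous + "/" + elem.getTagName() + "/*" matches the children of every sibling that shares that tag name, so a page with several sibling div elements would visit the same children more than once. A minimal sketch of one way around this is to record a 1-based position for each tag among its siblings. The Node class below is a hypothetical stand-in for WebElement, and IndexedXPathDemo and collect are illustrative names, not part of Selenium.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IndexedXPathDemo {

    /** Hypothetical minimal DOM node; stands in for WebElement in this sketch. */
    static final class Node {
        final String tag;
        final List<Node> children = new ArrayList<>();

        Node(String tag, Node... kids) {
            this.tag = tag;
            Collections.addAll(children, kids);
        }
    }

    /** Records one XPath per element, numbering same-tag siblings from 1. */
    static void collect(Node node, String path, List<String> out) {
        out.add(path);
        Map<String, Integer> seen = new HashMap<>(); // tag name -> count so far
        for (Node child : node.children) {
            int position = seen.merge(child.tag, 1, Integer::sum);
            collect(child, path + "/" + child.tag + "[" + position + "]", out);
        }
    }

    public static void main(String[] args) {
        Node html = new Node("html",
                new Node("body",
                        new Node("div", new Node("span")),
                        new Node("div", new Node("a"))));
        List<String> paths = new ArrayList<>();
        collect(html, "/html", paths);
        paths.forEach(System.out::println);
    }
}
```

With positions in the path, /html/body[1]/div[1] and /html/body[1]/div[2] stay distinct, so each recursive call addresses exactly one element.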

I think that is all you need for a minimal example. WebDriver is autowired, and this is a Spring Boot app.

If you need more information, please let me know. I appreciate the help.

The HTML may help as well. Here is a rough outline:

<html>
    <body><div id="wmHostPrimary">#shadow-root (open)</div></body>
    ...othertags
</html>

It may also help to know that the following code works:

String str = "return document.querySelector('#wmHostPrimary').shadowRoot.querySelector('body')";
WebElement element =  (WebElement) js.executeScript(str);
devin
  • Are you using Selenium 3 or 4? Which webdriver? (Chrome 96 onward changed things a bit, and Selenium 4 has a built-in method.) See here: https://groups.google.com/g/chromedriver-users/c/d1aexxUvQGk It's also worth noting that certain shadow DOM nodes will return null if they are attached with mode: closed, in which case you can try "return arguments[0]._root". – pcalkins Oct 25 '22 at 22:50
  • I am actually using geckodriver (the Firefox driver) because I am developing on an M1 MacBook Pro and chromedriver doesn't have an ARM build for its newest version. @pcalkins – devin Oct 25 '22 at 23:10
  • Unless I am wrong about that, in which case I will try the newest one. My previous attempts to use the current chromedriver failed because my installed versions of Chrome and chromedriver are not the same on the M1. – devin Oct 25 '22 at 23:11
  • Looks like they changed the naming from _m1 to _arm64. You can also set a system property to disable the version check: "webdriver.chrome.disableBuildCheck", "true" – pcalkins Oct 26 '22 at 17:55
  • That link, https://titusfortner.com/2021/11/22/shadow-dom-selenium.html, uses "return arguments[0].shadowRoot.children" for Firefox. – Raven Oct 28 '22 at 15:09