I am trying to use the Go library Chromedp to scrape some data from a web page.

I need to click a button, for example the "Click me" button on the W3Schools website. I have to select that button by the value attribute of its input tag, because it has no ID to target and most Chromedp examples use selectors based on the ID attribute.

The following code seems to hang forever on the initial web page without clicking the button.

  • Why does the following code not click the button?
  • Does Chromedp use standard XPath expressions, or something else? I assumed the selector syntax is a standard one that tools such as Selenium also use, but I cannot find where the rules are documented. What are the syntax rules for building Chromedp selectors?
  • Is there any documentation for Chromedp other than the source code and the Go docs?
package main

import (
    "context"
    "log"
    "time"

    "github.com/chromedp/chromedp"
)

func main() {
    var err error

    // create context
    ctxt, cancel := context.WithCancel(context.Background())
    defer cancel()

    // create chrome instance
    c, err := chromedp.New(ctxt, chromedp.WithLog(log.Printf))
    if err != nil {
        log.Fatal(err)
    }

    // run task list
    err = c.Run(ctxt, clickStuff())
    if err != nil {
        log.Fatal(err)
    }

    // shutdown chrome
    err = c.Shutdown(ctxt)
    if err != nil {
        log.Fatal(err)
    }

    // wait for chrome to finish
    err = c.Wait()
    if err != nil {
        log.Fatal(err)
    }

    log.Printf("DONE")
}

func clickStuff() chromedp.Tasks {
    return chromedp.Tasks{
        chromedp.Navigate(`https://www.w3schools.com/TAGS/tryit.asp?filename=tryhtml5_input_type_button`),
        chromedp.Click(`input[@value='Click me']`, chromedp.NodeVisible),
        chromedp.Sleep(5 * time.Second),
    }
}

When I run the code above I see all sorts of logs, but the lines below are printed over and over again. They seem to suggest that the tag is not there, although it is on the page. How can I work out which syntax to use for the selector?

2019/03/23 17:43:01 <- {"id":25,"method":"DOM.performSearch","params":{"query":"input[@value='Click me']"}}
2019/03/23 17:43:01 -> {"id":25,"result":{"searchId":"1000014442.18","resultCount":0}}
Jonathan Hall
TPPZ

2 Answers

You can match on an HTML attribute by passing an XPath expression together with the BySearch option:

chromedp.Click(`//*[@value="Click me"]`, chromedp.BySearch)
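
On the syntax question: the selector in the question, `input[@value='Click me']`, mixes two notations. The `@` makes it invalid as CSS, and read as XPath it is a relative path that only looks at direct children of the context node, which is likely why the search returns `resultCount: 0`. The following is a minimal sketch of the two well-formed spellings. The helpers `xpathLiteral`, `xpathByAttr` and `cssByAttr` are hypothetical names written for this illustration and are not part of chromedp.

```go
package main

import (
	"fmt"
	"strings"
)

// xpathLiteral quotes s as an XPath 1.0 string literal.
// XPath 1.0 has no escape sequences, so a value that contains both
// quote characters has to be assembled with concat().
func xpathLiteral(s string) string {
	switch {
	case !strings.Contains(s, `"`):
		return `"` + s + `"`
	case !strings.Contains(s, `'`):
		return `'` + s + `'`
	default:
		parts := strings.Split(s, `"`)
		for i, p := range parts {
			parts[i] = `"` + p + `"`
		}
		return "concat(" + strings.Join(parts, `, '"', `) + ")"
	}
}

// xpathByAttr (hypothetical helper) builds an XPath expression that matches
// any <tag> element, anywhere in the document, whose attribute equals value.
func xpathByAttr(tag, attr, value string) string {
	return "//" + tag + "[@" + attr + "=" + xpathLiteral(value) + "]"
}

// cssByAttr (hypothetical helper) builds the equivalent CSS attribute
// selector. Note that CSS uses no "@" and no leading "//".
func cssByAttr(tag, attr, value string) string {
	escaped := strings.NewReplacer(`\`, `\\`, `"`, `\"`).Replace(value)
	return tag + "[" + attr + `="` + escaped + `"]`
}

func main() {
	fmt.Println(xpathByAttr("input", "value", "Click me"))
	fmt.Println(cssByAttr("input", "value", "Click me"))
}
```

The first string is the XPath form intended for `chromedp.BySearch`; the second is the CSS form intended for `chromedp.ByQuery`. Whether either one reaches inside the iframe on this particular page is not something the sketch checks.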

I am not familiar with your particular language, but the button is inside an iframe. Usually you have to switch to that iframe to access an element inside it; in CSS you may be able to use a deep combinator instead.

For the page given this would be

*/deep/[value="Click me"]

A quick Google search suggests that CSS selector queries can be applied via BySearch.

QHarr