Scrape ONLY a certain `<div>` using gocolly

I'm trying to make a web scraper using gocolly. I want to scrape ONLY the `<div>` element with the id `dailyText` on https://wol.jw.org/en/wol/h/r1/lp-e. How can I do this?
Asked by altude. Viewed 826 times.
Check out [this example](https://github.com/gocolly/colly/blob/master/_examples/basic/basic.go) and replace the selector in `c.OnHTML("a[href]"...` with `div#dailyText`, then adapt the function accordingly. If it's not entirely clear, feel free to ask more questions or check out the [other examples](https://github.com/gocolly/colly/tree/master/_examples). – xarantolus, Jan 31 '21 at 19:37
1 Answer
Thanks to xarantolus for this answer.
This worked great for me (if the domain allowed me to use it, that is.)
    package main

    import (
        "fmt"
        "log"

        "github.com/gocolly/colly/v2"
    )

    func main() {
        cly := colly.NewCollector(
            // AllowedDomains takes bare host names, not full URLs.
            colly.AllowedDomains("yourpage.site"),
        )

        // Run the callback only for the <div> whose id is dailyText.
        cly.OnHTML("div#dailyText", func(e *colly.HTMLElement) {
            fmt.Printf("dailyText: %q\n", e.Text)
        })

        cly.OnRequest(func(r *colly.Request) {
            fmt.Println("Visiting", r.URL.String())
        })

        // Visit returns an error, not the page content.
        if err := cly.Visit("https://yourpage.site"); err != nil {
            log.Fatal(err)
        }
    }
Answered by altude
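A note on the "if the domain allowed me" caveat: colly compares each `AllowedDomains` entry with the host name of the request URL, so an entry that includes the scheme, such as `https://yourpage.site`, never matches and `Visit` returns a forbidden-domain error. The sketch below uses only the standard library to derive the bare host name from a URL. `hostOf` is a hypothetical helper introduced here for illustration; it is not part of colly.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostOf is a hypothetical helper (not part of colly): it returns the bare
// host name of rawURL, which is the form colly.AllowedDomains expects.
func hostOf(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	// A string without a scheme, e.g. "yourpage.site", parses as a path
	// and therefore has no host component.
	if u.Hostname() == "" {
		return "", fmt.Errorf("no host in %q", rawURL)
	}
	return u.Hostname(), nil
}

func main() {
	host, err := hostOf("https://wol.jw.org/en/wol/h/r1/lp-e")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(host) // prints "wol.jw.org"
}
```

The returned value can then be passed to `colly.AllowedDomains(host)`, while the full URL still goes to `Visit`.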