colly is a web scraping framework written in Go. Its import path is github.com/gocolly/colly (source at https://github.com/gocolly/colly). You will typically use this tag together with the main tag [go].
Questions tagged [go-colly]
63 questions
0
votes
1 answer
How to add the start of a url to a colly link list
I'm somewhat new to Go and am trying to scrape several webpages using colly. Two of the pages have incomplete links. Below are the code and output:
func PaloNet() {
c := colly.NewCollector(
…

GluttonousCrown
- 13
- 2
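For the incomplete-link question above, one possible approach is to resolve each href against the URL of the page it was found on. The sketch below uses only the standard library; the helper name resolveLink and the example.com URLs are hypothetical and not taken from the question. Inside a colly callback, e.Request.AbsoluteURL(href) is meant to serve a similar purpose.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLink turns a possibly relative href into an absolute URL,
// using the page it was found on as the base.
// (Hypothetical helper, for illustration only.)
func resolveLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	// ResolveReference applies the RFC 3986 rules for relative references.
	return base.ResolveReference(ref).String(), nil
}

func main() {
	// Assumed page URL and hrefs; replace with the real ones.
	page := "https://example.com/news/index.html"
	for _, href := range []string{"/about", "item/42", "https://other.example/x"} {
		abs, err := resolveLink(page, href)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(abs)
	}
}
```

Links that are already absolute pass through unchanged, so the same helper can be applied to every link in the list.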
0
votes
0 answers
Why is string not written in destination file using go colly?
I have a web scraper and I need to write a string from HTML code to my CSV file.
The HTML code looks like this:
0
votes
1 answer
How can I get with go colly some text that is placed inside a div?
I have a web scraper and I'm trying to get some text and write it in a CSV file. The HTML structure is: I have a div with class="css-1nrl4q4"; inside this div I have another div without class, and inside this second div I have two p elements that…

DvdiidI
- 63
- 6
0
votes
1 answer
Scrape description from a website with go-colly
I'm trying to scrape the description from a website, but I don't understand how to get to it.
My attempt:
pg := Program{}
slPG := []Program{}
c.OnHTML(".short", func(e *colly.HTMLElement) {
pg.Name = e.ChildText("h2.short-cat")
pg.Link =…

Eno Ron
- 11
- 2
0
votes
1 answer
Iterate over HTMLElement attributes with colly?
As seen in the HTMLElement struct, the attributes field is private:
// HTMLElement is the representation of a HTML tag.
type HTMLElement struct {
// Name is the name of the tag
Name string
Text string
attributes…

danthegoodman
- 501
- 4
- 10
0
votes
1 answer
Running Colly web scraper periodically using cron in Go
I was doing some web scraping using colly but wanted to run it periodically using cron. I tried a basic approach:
type scraper struct {
coll *colly.Collector
rc *redis.Client
}
func newScraper(c *colly.Collector, rc…

Adith Dev Reddy
- 1
- 1
0
votes
1 answer
Golang Colly Scraping - Website Captcha Catches My Scrape
I wrote a scraper for Amazon product titles, but Amazon's captcha catches it. I ran go run main.go 10 times: 8 times I was caught and 2 times I scraped the product title.
I researched this but did not find any solution for Golang (there is…

Melisa
- 310
- 2
- 16
0
votes
1 answer
Retry request in go-colly
I have this scraper library, and I would like to change my user agent if the first user agent returns an error, but this code doesn't work. If the first user agent fails, I send a second attempt, but this never finishes since OnHTML is not…

nanakondor
- 615
- 11
- 25
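For the retry question above, the general idea of trying a list of user agents in order can be sketched with net/http alone. This is not the asker's code and does not use colly; fetchWithAgents, newPickyServer, and the agent names are hypothetical, and the local test server only exists to make the example self-contained. In colly itself, a comparable pattern would likely involve an OnError callback that changes the header and calls the request's Retry method.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fetchWithAgents tries each user agent in order and returns the first
// one that gets a 200 response. (Hypothetical helper.)
func fetchWithAgents(client *http.Client, target string, agents []string) (string, error) {
	for _, ua := range agents {
		req, err := http.NewRequest(http.MethodGet, target, nil)
		if err != nil {
			return "", err
		}
		req.Header.Set("User-Agent", ua)
		resp, err := client.Do(req)
		if err != nil {
			continue // transport error: move on to the next agent
		}
		resp.Body.Close()
		if resp.StatusCode == http.StatusOK {
			return ua, nil
		}
	}
	return "", errors.New("all user agents failed")
}

// newPickyServer starts a local server that rejects every user agent
// except the accepted one. It stands in for a real target site.
func newPickyServer(accepted string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.UserAgent() != accepted {
			http.Error(w, "blocked", http.StatusForbidden)
			return
		}
		fmt.Fprint(w, "ok")
	}))
}

func main() {
	srv := newPickyServer("agent-b")
	defer srv.Close()

	ua, err := fetchWithAgents(srv.Client(), srv.URL, []string{"agent-a", "agent-b"})
	fmt.Println(ua, err)
}
```

The loop is bounded by the length of the agent list, so a site that rejects every agent produces an error instead of retrying forever.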
0
votes
1 answer
Colly not finding the body tag by xpath but finding it by selector name
I'm learning web scraping using gocolly. When I try to find the tag using selector name body, it successfully finds it. However, when I try to find the body tag by xpath /html/body, it fails to find it.
I have used OnHTML() with a simple callback…

kkin
- 33
- 2
- 6
0
votes
0 answers
Why is Go Colly Collector not always finding SVG tag?
I am trying to write a simple web scraper in Go using Colly. The program is supposed to visit an earnings calendar for a particular date range on yahoo finance and then spiral out and visit each Stock Ticker page that shows up in the list. The…

jpeeling13
- 1
- 1
0
votes
1 answer
problems with noscript when scraping using go-colly
I'm writing a scraping script for a website. Scraping the text works; only scraping the image fails. When I inspect the element the code looks normal, but when I view the page source the image's wrapping code changes to noscript. So I…

sultan achmad
- 3
- 2
0
votes
1 answer
Colly Max Depth and encoding/json - null
I have gone through the Go tour and I'm now going through some of the Colly tutorials. I understand the max depth and have been trying to implement it in a go program like so:
package main
import (
"encoding/json"
"log"
"net/http"
…

majordomo
- 1,160
- 1
- 15
- 34
0
votes
1 answer
Scrape ONLY a certain element using gocolly
I'm trying to make a web scraper using gocolly. I want to ONLY scrape an element with the id of dailyText on https://wol.jw.org/en/wol/h/r1/lp-e. How can I do this?

altude
- 41
- 6
0
votes
1 answer
How to bypass re-captcha with gocolly twocaptcha and selenium
After several requests, my scraping code gets blocked by the target site with reCAPTCHA. I use https://github.com/gocolly/twocaptcha to bypass the captcha with the Selenium Chrome driver. The bypass works with the Selenium Chrome driver, but when I run my scraping…

dikutandi
- 127
- 1
- 6
0
votes
1 answer
How to hook go-colly to elasticsearch?
What changes do I need to make to the code below to index into Elasticsearch using go-colly?
I want to get the full text (strip HTML, strip JS, render if needed), then
conform it to an Avro schema {pageurl: , title:, content:},
bulk-post to a specific Elasticsearch…

Espresso
- 5,378
- 4
- 35
- 66