Colly is a web scraping framework written in Go. Its repository is https://github.com/gocolly/colly; import it as github.com/gocolly/colly. You will typically use this tag together with the main tag [go].
Questions tagged [go-colly]
63 questions
2
votes
1 answer
Scraping a simple website with colly in golang does not return any data
I'm trying to scrape a simple website that looks like this:
"Name Surname 1 Name Surname 2 Name Surname 3 Name Surname 4" I wrote some simple Go code: package main import…

Kin Lu
- 53
- 4
2
votes
1 answer
Gocolly scraping only certain links
While scraping this link, I just want to scrape library links, but the code I wrote extracts all the links and I couldn't manage to filter them. (I'm parsing the URLs for later use in github…

Enes Alp Aslan
- 21
- 3
2
votes
1 answer
Colly difference between Request.Visit and collector.Visit
I have written a colly script to collect port authority information from a site.
func main() {
// Temp Variables
var tcountry, tport string
// Colly collector
c := colly.NewCollector()
//Ignore the robot.txt
…

CaptV89
- 61
- 1
- 5
2
votes
0 answers
How to scrape an unordered list with go-colly?
I am trying to build a personal scraper of food recipes. I am able to get every element except the food ingredients, which are in an unordered list.
Here is a snippet of the page html:
pagehtml
My code so far that doesn't find strong element but prints…

M2R10
- 23
- 5
2
votes
1 answer
How to make gocolly crawl slower
I am using gocolly to harvest data from my website. The challenge is that gocolly is too aggressive when crawling the URLs. I have added a RandomDelay
Update
Based on the answer I changed
c.Limit(&colly.LimitRule{
RandomDelay: 10 *…

kristian nissen
- 2,809
- 5
- 44
- 68
2
votes
1 answer
Unable to Select an option from the dropdown for web scraping using gocolly/colly
I want to scrape data from the below public website using Golang gocolly/colly -
https://eds.ospi.k12.wa.us/BusDepreciation/default.aspx?pageName=busSearch
For the above website, I want to select all the "School District" options available in the…

Rahul Satal
- 2,107
- 3
- 32
- 53
2
votes
1 answer
Golang concurrent R/W to database
I'm writing some Go software that is responsible for downloading and parsing a large number of JSON files and writing that parsed data to a sqlite database. My current design has 10 go routines simultaneously downloading/parsing these JSONs and…

Arthur Krut
- 23
- 4
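One common shape for the design described above is to let many goroutines parse in parallel while a single goroutine performs every write, so the storage layer never sees concurrent writers. The following is a minimal sketch under stated assumptions, not the asker's code: Record, parseAll, and store are hypothetical names, the JSON schema is invented, and store stands in for the real database insert. It also assumes workers is greater than zero.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Record is a hypothetical parsed row; the real schema is not shown in the question.
type Record struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// parseAll fans the raw JSON documents out to `workers` goroutines and
// funnels every parsed Record through one channel to a single writer.
// store stands in for the database insert and is only ever called from
// the goroutine that called parseAll, so it needs no locking of its own.
func parseAll(docs []string, workers int, store func(Record)) int {
	jobs := make(chan string)
	parsed := make(chan Record)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for raw := range jobs {
				var r Record
				if err := json.Unmarshal([]byte(raw), &r); err != nil {
					continue // skip malformed input in this sketch
				}
				parsed <- r
			}
		}()
	}

	// Feed the jobs, then close parsed once all workers have finished.
	go func() {
		for _, d := range docs {
			jobs <- d
		}
		close(jobs)
		wg.Wait()
		close(parsed)
	}()

	// Single writer: the only place store is called.
	written := 0
	for r := range parsed {
		store(r)
		written++
	}
	return written
}

func main() {
	docs := []string{`{"id":1,"name":"a"}`, `{"id":2,"name":"b"}`, `not json`}
	var rows []Record
	n := parseAll(docs, 10, func(r Record) { rows = append(rows, r) })
	fmt.Println(n, len(rows)) // prints "2 2": the malformed document is skipped
}
```

The order in which records reach store is not deterministic, so anything that depends on ordering would need to sort or carry a sequence number.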
1
vote
0 answers
Why does using async mode/queue when parsing with gocolly yield inconsistent results?
package main
import (
"fmt"
"strings"
"sync/atomic"
"time"
"github.com/gocolly/colly/v2"
"github.com/gocolly/colly/v2/queue"
)
func main() {
c := colly.NewCollector(
)
c.SetRequestTimeout(time.Minute * 5)
…

Don Draper
- 463
- 7
- 21
1
vote
1 answer
Is it possible to crawl a CSR website using gocolly?
Is it possible to crawl CSR (client-side rendered, JavaScript) websites using gocolly? I need to crawl many websites, and for that, I have a titleXpath in the database as follows:
c.OnXML(titleXpath, func(e *colly.XMLElement) {
data = append(data, e.Text)
…

myagmartseren
- 11
- 3
1
vote
1 answer
How to run go colly in parallel mode with a depth of 1 and multiple links
I have a go colly project that I use to crawl multiple links that I fetch from a table, like below:
func main() {
//db, err := sql.Open("postgres", "postgresql://postgres:postgres@localhost:5432/db?sslmode=disable")
dbutil.Init()
defer…

Farshad
- 1,830
- 6
- 38
- 70
1
vote
0 answers
Golang scraper: how to go to another page using the URLs in the struct
I'm writing a Go scraper to get information from this site: https://www.allrecipes.com/recipes/17562/dinner/
I want to get:
Name,
URL,
Descriptions,
Ingredients,
Photos,
Directions.
How can I use the links in the struct products URL to send the…

maka
- 39
- 7
1
vote
1 answer
Scraper colly in headless mode?
Hello,
I am new to Go and I have to make a scraper for my school in France.
The site I have to scrape is www.allrecipes.com. On this site, I chose this page https://www.allrecipes.com/recipes/17562/dinner/
On…

maka
- 39
- 7
1
vote
1 answer
Colly - How to get the value of a child attribute?
Here is the sample page I have been working on: https://www.lazada.vn/-i1701980654-s7563711492.html
Here is the element I want to get (the product title)
...

Chau Loi
- 1,106
- 1
- 14
- 36
1
vote
1 answer
Golang colly crawling error Too Many Requests
I'm trying to scrape some information from Google Trends. But every time that I try to get some data I receive the error Too Many Requests. Other sites are ok.
My code:
func Teste(searchTrend string) {
searchTrend = strings.Trim(searchTrend, "…

Maick Machado
- 45
- 6
1
vote
0 answers
Does the delay parameter in gocolly delay the website visit or the response?
When does the random delay in the colly limiter take place?
Based on the example code from: http://go-colly.org/docs/examples/random_delay/ I wrote the following:
func main() {
url := "https://httpbin.org/delay/2"
// Instantiate default…

Rikku
- 39
- 3