Questions tagged [go-colly]

Colly is a web scraping framework written in Go. Its import path is github.com/gocolly/colly (repository: https://github.com/gocolly/colly). This tag is typically used together with the main [go] tag.

63 questions

votes

1 answer

How to add the start of a url to a colly link list

I'm somewhat new to Go and am trying to scrape several webpages using colly. Two of the pages have incomplete links. Below is the code and output: func PaloNet() { c := colly.NewCollector( …

go web-scraping go-colly

asked Nov 22 '22 at 10:55

GluttonousCrown
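
For incomplete (relative) links like those described in the question above, one possible approach is to resolve each href against the URL of the page it came from. The sketch below uses only the standard library's net/url package; the helper name absoluteURL and the example.com URLs are made up for illustration and do not come from the question. Inside a colly callback, e.Request.AbsoluteURL(href) serves a similar purpose.

```go
package main

import (
	"fmt"
	"net/url"
)

// absoluteURL resolves href against base and returns the absolute form.
// Hypothetical helper for illustration; it is not part of colly.
func absoluteURL(base, href string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	h, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	// ResolveReference applies the RFC 3986 rules: absolute hrefs are kept,
	// relative ones are completed from the base URL.
	return b.ResolveReference(h).String(), nil
}

func main() {
	// Placeholder page URL and hrefs, assumed for the example.
	page := "https://example.com/blog/index.html"
	for _, href := range []string{"/news/item-1", "item-2", "https://other.example/x"} {
		abs, err := absoluteURL(page, href)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(abs)
	}
}
```

Resolving against the page URL, rather than prepending a fixed prefix, handles root-relative, path-relative, and already-absolute links with the same code.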

votes

0 answers

Why is string not written in destination file using go colly?

I have a web scraper and I need to write a string from HTML code to my CSV file. The HTML code looks like this:

Bucuresti - Ilfov, Bucuresti,
…</div>
<div class=

go web-scraping go-colly

asked Sep 13 '22 at 09:24

DvdiidI

votes

1 answer

How can I get with go colly some text that is placed inside a div?

I have a web scraper and I'm trying to get some text and write it in a CSV file. The HTML structure is: I have a div with class="css-1nrl4q4"; inside this div I have another div without class, and inside this second div I have two p elements that…

go web-scraping go-colly

asked Sep 01 '22 at 08:46

DvdiidI

votes

1 answer

Scrape description from website with go-colly

I'm trying to scrape the description from a website (img), but I don't understand how to get there. My attempt: pg := Program{} slPG := []Program{} c.OnHTML(".short", func(e *colly.HTMLElement) { pg.Name = e.ChildText("h2.short-cat") pg.Link =…

go go-colly

asked Aug 21 '22 at 09:06

Eno Ron

votes

1 answer

Iterate over HTMLElement attributes with colly?

As seen in the HTML struct, the attributes is a private property: // HTMLElement is the representation of a HTML tag. type HTMLElement struct { // Name is the name of the tag Name string Text string attributes…

go go-colly

asked Jul 27 '22 at 00:39

danthegoodman

votes

1 answer

Running Colly web scraper periodically using cron in Go

I was doing some web scraping using colly but wanted to run it periodically using cron. I did try out a basic approach to it. type scraper struct { coll *colly.Collector rc *redis.Client } func newScraper(c *colly.Collector, rc…

go web-scraping cron go-colly

asked Nov 13 '21 at 06:06

Adith Dev Reddy

votes

1 answer

Golang Colly Scraping - Website Captcha Catches My Scrape

I wrote a scraper for Amazon product titles, but Amazon's captcha catches it. I ran go run main.go 10 times (8 times it caught me, 2 times I scraped the product title). I researched this but did not find any solution for Golang (there is…

go web-scraping reverse-proxy go-colly

asked Jun 25 '21 at 13:05

Melisa

votes

1 answer

Retry request in go-colly

I have this scraper library, and I would like to change my user agent if the first user agent returns an error. This code doesn't work: if the first user agent fails, I send a 2nd attempt, but it never finishes since onHTML is not…

go go-colly

asked May 23 '21 at 07:48

nanakondor

votes

1 answer

Colly not finding the body tag by xpath but finding it by selector name

I'm learning web scraping using gocolly. When I try to find the tag using selector name body, it successfully finds it. However, when I try to find the body tag by xpath /html/body, it fails to find it. I have used OnHTML() with a simple callback…

go web-scraping go-colly

asked Apr 16 '21 at 05:39

kkin

votes

0 answers

Why is Go Colly Collector not always finding SVG tag?

I am trying to write a simple web scraper in Go using Colly. The program is supposed to visit an earnings calendar for a particular date range on yahoo finance and then spiral out and visit each Stock Ticker page that shows up in the list. The…

go web-crawler go-colly

asked Mar 06 '21 at 05:15

jpeeling13

votes

1 answer

problems with noscript when scraping using go-colly

I'm making a scraping script for a website. Scraping the text succeeds; only scraping the image fails. When I inspect the element the code looks normal, but when I view the source the image wrapping code changes to noscript. So I…

go go-colly

asked Feb 20 '21 at 03:41

sultan achmad

votes

1 answer

Colly Max Depth and encoding/json - null

I have gone through the Go tour and I'm now going through some of the Colly tutorials. I understand the max depth and have been trying to implement it in a Go program like so: package main import ( "encoding/json" "log" "net/http" …

go go-colly

asked Feb 11 '21 at 01:48

majordomo

votes

1 answer

Scrape ONLY a certain element using gocolly

I'm trying to make a web scraper using gocolly. I want to scrape ONLY an element with the id of dailyText on https://wol.jw.org/en/wol/h/r1/lp-e. How can I do this?

go go-colly

asked Jan 30 '21 at 18:48

altude

votes

1 answer

How to bypass re-captcha with gocolly twocaptcha and selenium

After several requests, my scraping code gets blocked by the target site with reCAPTCHA. I use https://github.com/gocolly/twocaptcha to bypass the captcha with the Selenium Chrome driver. The bypass works with the Selenium Chrome driver, but when I run my scraping…

selenium go selenium-chromedriver go-colly

asked Aug 12 '20 at 17:47

dikutandi

votes

1 answer

How to hook go-colly to elasticsearch?

What change do I make in the code below to index into Elasticsearch using go-colly? I want to get the full text (strip HTML, strip JS, render if needed), then conform it to an Avro schema {pageurl: , title:, content:}, and bulk-post to a specific elastic-search…

go elasticsearch go-colly

asked May 07 '20 at 02:01

Espresso

