
Is it possible to crawl client-side rendered (CSR/JavaScript) websites using gocolly? I need to crawl many websites, and for each one I have a titleXpath stored in the database, which I use as follows:

c.OnXML(titleXpath, func(e *colly.XMLElement) {
   data = append(data, e.Text)
   fmt.Println("title", e.Text)
})

Does this work for CSR sites, or do I need another package?

1 Answer


It is not possible to crawl client-side rendered (CSR/JS) websites using gocolly alone. gocolly is a scraping library for Go that operates at the HTTP level: it parses the HTML document the server returns, but it does not execute JavaScript, so content injected by scripts never reaches callbacks such as OnXML.

To scrape CSR websites, you need a headless browser or a web scraping tool that supports JavaScript rendering. Some popular options for scraping CSR websites include:

  • chromedp, a Go library that drives headless Chrome over the DevTools Protocol (the same approach Puppeteer takes in Node.js)
  • Selenium WebDriver, used from Go through a client library such as goselenium
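As a rough sketch of the headless-browser route, the following uses chromedp to load a page, let its JavaScript run, and read the text of the first node matching an XPath. The `renderedText` helper, the example URL, the //h1 expression, and the 30-second timeout are assumptions for illustration; it needs the github.com/chromedp/chromedp module and a locally installed Chrome or Chromium.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/chromedp/chromedp"
)

// renderedText (hypothetical helper) opens url in headless Chrome and
// returns the visible text of the first node matching xpath, read from
// the DOM after the page's scripts have run.
func renderedText(url, xpath string) (string, error) {
	// Start a browser tab that is shut down when cancel is called.
	ctx, cancel := chromedp.NewContext(context.Background())
	defer cancel()

	// Give up if the node never appears (assumed limit of 30 seconds).
	ctx, cancelTimeout := context.WithTimeout(ctx, 30*time.Second)
	defer cancelTimeout()

	var text string
	err := chromedp.Run(ctx,
		chromedp.Navigate(url),
		// BySearch accepts XPath expressions as well as CSS selectors.
		chromedp.Text(xpath, &text, chromedp.BySearch),
	)
	return text, err
}

func main() {
	// In the question's setup, the XPath would come from the titleXpath
	// column in the database instead of being hard-coded.
	title, err := renderedText("https://example.com", "//h1")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("title", title)
}
```

Starting a browser per page is much slower than a plain HTTP fetch, so one option is to keep gocolly for server-rendered sites and use a browser only for the sites that need it.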
Prashant Luhar