Can't scrape particular websites using Scrapinghub

Asked Sep 22 '14 at 02:30

Active Sep 22 '14 at 02:30


I am using the autoscraping feature in the scrapinghub service.

While building and deploying the autoscraper, I found that the job for the site I wanted to scrape never returned any Requests and timed out after about 3.5 minutes.

So I started reading the documentation to figure out why this was happening (see "How to check if site is suitable for autoscraping").

I followed the steps and temporarily disabled JavaScript in my browser (Chrome), and found that I had no problems viewing the site I wanted to scrape.
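For reference, the same check can be approximated from a script instead of the browser. This is only a minimal sketch, assuming Python 3 and the standard library: a plain HTTP client never executes JavaScript, so it sees roughly what a non-rendering crawler would. The function name `fetch_without_js`, the default timeout, and the User-Agent string are my own illustrative choices, not part of Scrapinghub's tooling.

```python
import time
import urllib.error
import urllib.request


def fetch_without_js(url, timeout=30.0, user_agent="Mozilla/5.0"):
    """Fetch url with a plain HTTP client, which never executes JavaScript.

    Returns a dict with the HTTP status, elapsed seconds, body size in
    bytes, and an error string (None on success).
    """
    # user_agent is an illustrative default; some sites respond differently
    # depending on this header.
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    started = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read()
            status, error = response.status, None
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 403 or 503.
        body, status, error = exc.read(), exc.code, str(exc)
    except (urllib.error.URLError, OSError) as exc:
        # No usable answer at all: DNS failure, refused connection, timeout.
        body, status, error = b"", None, str(exc)
    return {
        "status": status,
        "seconds": round(time.monotonic() - started, 2),
        "bytes": len(body),
        "error": error,
    }
```

Calling `fetch_without_js(...)` with the page URL and comparing the result against what the browser shows is one way to tell a slow or refusing server apart from a page that simply needs JavaScript.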

My question, at the risk of sounding vague: what other reasons, aside from JavaScript, might make a site unscrapeable? And are there other ways to diagnose a problem like this?


tumultous_rooster


Hello, could you please share the link to the site you're trying to scrape? – Oleg Tarasenko Sep 22 '14 at 12:09
Hi Oleg, I recognize you from the tutorials on Facebook! :) The URL is here: http://goo.gl/tk5qb – tumultous_rooster Sep 22 '14 at 18:04

