I'm already using urllib2 to fetch pages through a proxy, but it's taking far too long. I know that going through a proxy is slower, but it still takes much longer than when I test the same proxy in Firefox or IE.
Thanks.
I am working with a Node.js project (using Wikistream as a basis, so not totally my own code) which streams real-time Wikipedia edits. The code breaks each edit down into its component parts and stores it as an object (see the gist at…
I've been trying to figure this one out for about a week now and just can't come up with a good solution. So, I figured I would see if anyone could help me out. Here's one of the links that I'm trying to…
Facebook scraper throws some weird stuff when reading the contents of my page...
Page URL:
http://www.protagora.hr/Stranica/O-nama/9/
Scrape debug output:…
I have a web shop built on PrestaShop.
I am trying to integrate the Like button, and I observed that on some pages it scrapes a thumbnail and on other pages it does not.
I found the page that shows exactly what the scraper sees,
so the…
I found a PHP script to scrape company profile pages from LinkedIn here https://stackoverflow.com/questions/42329819/how-can-i-scrape-linkedin-company-pages-with-curl-and-php-no-csrf-token-found-i#=
I replaced the User-Agent with my own. It…
I have a LinkedIn scraper (built in Python) already set up which takes a list of company URLs as input, and outputs all the information about that company (such as location, website, and size (number of employees)).
The problem is the input: it…
I am writing a web scraper using Selenium in Python. I wrote the script to pull information from one site, then go to another and pull different information (emails).
When I run the script with browser = webdriver.Firefox(), the script behaves…
I've been trying to figure out how to scrape this page: sick.com
I can't figure it out. I've been trying Visual Web Ripper, but it doesn't get past the submit form because it doesn't remember the cookie. Do you have any ideas? Sick.com is ok with me…
Possible Duplicate:
Facebook won’t share a link to my site
I have 2 websites that fail to show an image when pasted into Facebook. So I went to the Facebook object debugger and compared what the scraper sees to what view source…
Is it possible to scrape the products from an e-commerce site using the Anemone and Nokogiri libs in Ruby?
I understand how to pull the data I need from each product page using Nokogiri, but I can't figure out how to make Anemone/Nokogiri crawl the…
I am working on a project where I have inherited some code that logs into a website using Python's 'requests' library and scrapes the site for content. The 'login' code uses a backend URL to POST credentials to an endpoint. (Works fine)
There is…
I am trying to find the full webpage address for a form generated by a website. The website is https://treasurer.maricopa.gov/Parcel/?Parcel=50427029
Once you get there, I want to see the web address for the Redemption Statement. You click on it…