Questions tagged [scraperwiki]

ScraperWiki was an online tool for screen scraping.

ScraperWiki was a platform for writing and scheduling screen scrapers, and for storing the data they generate. It supported Ruby, Python and PHP. A later version of the service was called QuickCode, which has also been decommissioned.

"Scraper" refers to screen scrapers, programs that extract data from websites. "Wiki" means that any user with programming experience can create or edit such programs for extracting new data, or for analyzing existing datasets.

68 questions

2 answers

How to scrape more than first instance of triple-nested list of links in Python?

I am trying to determine the simplest way to record the contents of webpages linked from webpages linked from an original webpage. I would like my output to be a table with rows corresponding to the contents of the third layer deep of pages. As you…

python scraperwiki

asked Jul 11 '13 at 10:16

toddntucker

1 answer

How to return "N/A" given blank values in Python and ScraperWiki

Hi: I am new to Scraperwiki and Python, and trying to figure out how to return "NA" or something similar when there is no item on a scraped webpage that meets my cssselect specifications. In my code below, I am scraping a double-nested set of…

python scraperwiki

asked Jul 09 '13 at 14:52

toddntucker

1 answer

Django Dynamic Scraper Project does not run on windows even though it works on Linux

I am trying to set up a project with Django Dynamic Scraper. I have tested it on Linux and it runs properly. When I try to run the command syncdb, I get this…

python django web-scraping scraper scraperwiki

asked Jun 28 '13 at 11:53

user4650611

1 answer

Scraperwiki Twitter Query

Please forgive me, as I have limited knowledge of ScraperWiki and Twitter mining. I have the following code to scrape Twitter data. However, I want to edit the code to only give me results that are geotagged for New York on a particular date (let's…

twitter scraperwiki

asked May 09 '13 at 22:51

user2368126

1 answer

Scraperwiki character encoding anomaly

Here is a ScraperWiki scraper written in Python: import lxml.html import scraperwiki from unidecode import unidecode html =…

python unicode python-unicode scraperwiki

asked May 07 '13 at 19:28

user82216

2 answers

Appending data to ScraperWiki datastore

Here is a simple Python script to store some data in ScraperWiki: import scraperwiki scraperwiki.sqlite.save(unique_keys=["a"], data={"a":1, "b":"Foo"}) scraperwiki.sqlite.save(unique_keys=["a"], data={"a":1, "c":"Bar"}) The result is the following…

python scraperwiki

asked May 07 '13 at 17:12

user82216

1 answer

Debugging ScraperWiki scraper (producing spurious integer)

Here is a scraper I created using Python on ScraperWiki: import lxml.html import re import scraperwiki pattern = re.compile(r'\s') html = scraperwiki.scrape("http://www.shanghairanking.com/ARWU2012.html") root = lxml.html.fromstring(html) for tr in…

python screen-scraping scraperwiki

asked May 06 '13 at 10:44

user82216

1 answer

Unicode issue with Python scraper

I've been writing bad Perl for a while, but am attempting to learn to write bad Python instead. I've read around the problem I've been having for a couple of days now (and know an awful lot more about Unicode as a result) but I'm still having…

python unicode urllib2 scraperwiki

asked Apr 28 '13 at 18:48

mediaczar

1 answer

Scraping links from more than one URL

I'm using ScraperWiki to pull in links from the london-gazette.co.uk site. How would I edit the code so that I can paste in a number of separate search URLs at the bottom which are all collated into the same datastore? At the moment I can just paste…

python url scraperwiki

asked Apr 15 '13 at 15:54

Henry Taylor

1 answer

Using ScraperWiki to get information from a div element

Is there a way to get the data out of a div-container with ScraperWiki? I've got a line of HTML that is something like:

9.0 …

python web-scraping scraperwiki

asked Apr 14 '13 at 11:41

CanadaRunner

1 answer

scraperwiki: why does my scraper work for 1 url but not another?

This is my first scraper: https://scraperwiki.com/scrapers/my_first_scraper_1/ I managed to scrape google.com, but not this page: http://subeta.net/pet_extra.php?act=read&petid=1014561 Any reasons why? I have followed the documentation from…

php screen-scraping scraperwiki

asked Mar 02 '13 at 03:42

Kim Stacks

1 answer

PHP FOR loop stops after 2 loops, exit status 139

I'm building a scraper with ScraperWiki, here: https://scraperwiki.com/scrapers/fashfinder/edit/# Without boring you with too many details, I load up about 120 links into an array, $allLinks. Then, at the bottom of the page, I call a FOR loop on…

php web-scraping screen-scraping scraperwiki

asked Feb 23 '13 at 04:34

JVG

3 answers

PHP variables in scraper function

I'm using ScraperWiki to build a simple screen scraper getting links from an online store. The store has multiple pages, so I want to get all the links from the first page, find the "next" button in the pager, go to that url, find all the links from…

php web-scraping scraperwiki

asked Feb 21 '13 at 23:27

JVG

1 answer

Proxy / Fetch data from other countries

Certain websites require us to have a particular IP address to display certain information, e.g. ads for country X. I would like to know if it is possible to use a proxy (preferably a Ruby one) with my Ruby script on ScraperWiki to get the results as if I…

ruby proxy web-scraping scraperwiki

asked Feb 16 '13 at 14:39

Pedro Pereira

1 answer

How to get selenium to work on scraperwiki

I love Selenium and I love ScraperWiki, but somehow I cannot get them to work properly together. I've tried to open a website in two ways with Selenium on ScraperWiki, both methods taken from tutorials: import selenium sel =…

python parsing selenium urllib2 scraperwiki

asked Jan 11 '13 at 17:39

cantdutchthis
