Questions tagged [scraperwiki]

ScraperWiki was an online tool for screen scraping.

ScraperWiki was a platform for writing and scheduling screen scrapers, and for storing the data they generated. It supported Ruby, Python and PHP. A later version of the service was called QuickCode, which has also been decommissioned.

"Scraper" refers to screen scrapers, programs that extract data from websites. "Wiki" means that any user with programming experience can create or edit such programs for extracting new data, or for analyzing existing datasets.

68 questions
0
votes
2 answers

How to scrape more than first instance of triple-nested list of links in Python?

I am trying to determine the simplest way to record the contents of webpages linked from webpages linked from an original webpage. I would like my output to be a table with rows corresponding to the contents of the third layer deep of pages. As you…
toddntucker
  • 43
  • 1
  • 7
0
votes
1 answer

How to return "N/A" given blank values in Python and ScraperWiki

Hi: I am new to Scraperwiki and Python, and trying to figure out how to return "NA" or something similar when there is no item on a scraped webpage that meets my cssselect specifications. In my code below, I am scraping a double-nested set of…
toddntucker
  • 43
  • 1
  • 7
0
votes
1 answer

Django Dynamic Scraper Project does not run on windows even though it works on Linux

I am trying to make a project in dynamic django scraper. I have tested it on linux and it runs properly. When I try to run the command: syndb i get this…
user4650611
0
votes
1 answer

Scraperwiki Twitter Query

Please forgive me, as I have limited knowledge of scraperwiki and twitter mining. I have the following code to scrape twitter data. However, I want to edit the code to only give me results that are geotagged for New York on a particular date (let's…
0
votes
1 answer

Scraperwiki character encoding anomaly

Here is a ScraperWiki scraper written in Python: import lxml.html import scraperwiki from unidecode import unidecode html =…
user82216
0
votes
2 answers

Appending data to ScraperWiki datastore

Here is a simple Python script to store some data in ScraperWiki: import scraperwiki scraperwiki.sqlite.save(unique_keys=["a"], data={"a":1, "b":"Foo"}) scraperwiki.sqlite.save(unique_keys=["a"], data={"a":1, "c":"Bar"}) The result is the following…
user82216
0
votes
1 answer

Debugging ScraperWiki scraper (producing spurious integer)

Here is a scraper I created using Python on ScraperWiki: import lxml.html import re import scraperwiki pattern = re.compile(r'\s') html = scraperwiki.scrape("http://www.shanghairanking.com/ARWU2012.html") root = lxml.html.fromstring(html) for tr in…
user82216
0
votes
1 answer

Unicode issue with Python scraper

I've been writing bad perl for a while, but am attempting to learn to write bad python instead. I've read around the problem I've been having for a couple of days now (and know an awful lot more about unicode as a result) but I'm still having…
mediaczar
  • 1,960
  • 3
  • 18
  • 23
0
votes
1 answer

Scraping links from more than one URL

I'm using ScraperWiki to pull in links from the london-gazette.co.uk site. How would I edit the code so that I can paste in a number of separate search URLs at the bottom which are all collated into the same datastore? At the moment I can just paste…
0
votes
1 answer

Using ScraperWiki to get information from a div element

Is there a way to get the data out of a div-container with ScraperWiki? I've got a line of HTML that is something like:
9.0
CanadaRunner
  • 65
  • 10
0
votes
1 answer

scraperwiki: why does my scraper work for 1 url but not another?

This is my first scraper https://scraperwiki.com/scrapers/my_first_scraper_1/ I managed to scrape google.com but not this page. http://subeta.net/pet_extra.php?act=read&petid=1014561 any reasons why? I have followed the documentation from…
Kim Stacks
  • 10,202
  • 35
  • 151
  • 282
0
votes
1 answer

PHP FOR loop stops after 2 loops, exit status 139

I'm building a scraper with Scraper Wiki, here: https://scraperwiki.com/scrapers/fashfinder/edit/# Without boring you with too many details, I load up about 120 links into an array, $allLinks. Then, at the bottom of the page, I call a FOR loop on…
JVG
  • 20,198
  • 47
  • 132
  • 210
0
votes
3 answers

PHP variables in scraper function

I'm using ScraperWiki to build a simple screen scraper getting links from an online store. The store has multiple pages, so I want to get all the links from the first page, find the "next" button in the pager, go to that url, find all the links from…
JVG
  • 20,198
  • 47
  • 132
  • 210
0
votes
1 answer

Proxy / Fetch data from other countries

Certain websites require us to have a particular IP address to display certain information eg. ads for country X. I would like to know if it is possible to use a proxy (preferably ruby one) with my ruby script @scraperwiki to get the results as if I…
Pedro Pereira
  • 480
  • 5
  • 12
0
votes
1 answer

How to get selenium to work on scraperwiki

I love selenium and I love scraperwiki but somehow I cannot get them to work properly together. I've tried to open a website in two ways with selenium on scraperwiki, both methods have been gotten from tutorials: import selenium sel =…
cantdutchthis
  • 31,949
  • 17
  • 74
  • 114