Questions tagged [data-harvest]

24 questions
0
votes
1 answer

Harvesting data with rvest retrieves no value from data-widget

I'm trying to harvest data using rvest (I also tried XML and selectr) but I am running into the following problem: in my browser's web inspector the HTML looks like
greyBag
  • 387
  • 3
  • 14
0
votes
0 answers

Harvesting data from webpage in R - accessing multiple pages

I am following up on my question from yesterday (harvesting data via drop down list in R). First, I need to obtain all 50k strings of details of all doctors from this page: http://www.lkcr.cz/seznam-lekaru-426.html#seznam I know how to obtain them from…
johnnyheineken
  • 543
  • 7
  • 20
0
votes
0 answers

CKAN: harvest blocked

I don't know exactly where the problem is, so I am writing here to get some tips or clues about it. I would like to know if anybody has an opinion or an idea about it. The harvesting with CKAN seems to work (I am able to get the data on the open data…
0
votes
1 answer

CKAN harvester 'nav_named_link' error

In CKAN, when I try to create a new Harvest Source I get this error: Error - : 'ckan.lib.helpers.HelperAttributeDict object' has no attribute 'nav_named_link' URL: https://127.0.0.1:5000/harvest/new Does…
user3673449
  • 347
  • 2
  • 5
  • 20
0
votes
2 answers

Looking up multiple values from a single cell

I have a data set where a lot of different categories and data were crammed into one cell. For example, I have one cell that has names of individuals and a percentage: Jess 15%, Frank 20%, Allan 50%, Steve 15% I would like to find a function that…
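The excerpt does not say which tool the asker is using, so the following is only a minimal sketch in Python of one way to split such a cell into name and percentage pairs. The function name `split_cell` is hypothetical, and the sketch assumes every entry has the form "Name NN%" and that entries are separated by commas.

```python
import re
from typing import Dict


def split_cell(cell: str) -> Dict[str, float]:
    """Parse a cell like 'Jess 15%, Frank 20%' into {name: percentage}."""
    pairs: Dict[str, float] = {}
    for item in cell.split(","):
        # Assumed entry shape: a name, whitespace, a number, then '%'.
        match = re.fullmatch(r"\s*(.+?)\s+(\d+(?:\.\d+)?)%\s*", item)
        if match is None:
            continue  # skip fragments that do not look like "Name NN%"
        pairs[match.group(1)] = float(match.group(2))
    return pairs


cell = "Jess 15%, Frank 20%, Allan 50%, Steve 15%"
shares = split_cell(cell)
print(shares["Frank"])  # look up one individual's percentage by name
```

Once the cell is a dictionary, looking up any individual's value is a plain key access, and malformed fragments are skipped rather than raising an error.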
0
votes
1 answer

CKAN harvester: "No module named pika" error

On a CKAN instance that was running fine, I installed the harvester extension following this guide: https://github.com/ckan/ckanext-harvest These are the steps I followed: . /usr/lib/ckan/default/bin/activate cd /usr/lib/ckan/default/src/ckan sudo pip install…
opensas
  • 60,462
  • 79
  • 252
  • 386
0
votes
1 answer

Harvesters using DCAT extension get stuck

We've been using ckanext-dcat to harvest from remote JSON sources. Sometimes harvest jobs didn't finish and had to be deleted along with all the datasets from that source, which is not very convenient, but then everything goes back to normal. I don't…
Urkonn
  • 90
  • 6
-1
votes
1 answer

Learning data harvesting

I want to build a website that will harvest data from: * the Facebook statuses of my friends * other websites. Unfortunately, I don't know how to harvest data. Can someone recommend a book or tutorial? How should I approach this field?
Elad Benda
  • 35,076
  • 87
  • 265
  • 471
-2
votes
1 answer

Unable to extract web content (href tags) using Python 3.7

I am unable to scrape @href attributes from "https://www.theaic.co.uk/aic/analyse-investment-companies". I'm using Python 3.7 with Scrapy and Splash, and I also tried Selenium, but without success.
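As a rough illustration of the extraction step only, here is a minimal sketch using Python's standard-library `html.parser`. It assumes the page HTML is already available as a string (for example, after a rendering tool has produced it); the class name `HrefCollector` and the sample markup are hypothetical and are not taken from the site in the question.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


# Hypothetical markup standing in for the already-rendered page source.
sample_html = (
    '<ul><li><a href="/company/a">A</a></li>'
    '<li><a href="/company/b">B</a></li>'
    "<li><a>no link</a></li></ul>"
)

collector = HrefCollector()
collector.feed(sample_html)
print(collector.hrefs)
```

If the links are injected by JavaScript after the page loads, the raw response will not contain them, so a parser like this only helps once the rendered HTML has been captured.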