Highest Voted 'scraper' Questions

3

votes

3 answers

How do I create an HTML scraper in PHP and get it working properly?

Please help! I am looking to develop a PHP script to do the following: scrape a remote HTML page and extract selected data (e.g. a particular table/div), then save the extracted data into a database (e.g. MySQL). Can anyone help out? Thanks and…

php mysql scraper

asked Aug 24 '10 at 10:29

user429384

31
1
2

3

votes

1 answer

BeautifulSoup MemoryError When Opening Several Files in Directory

Context: Every week, I receive a list of lab results in the form of an HTML file. Each week, there are about 3,000 results, with each set of results having between two and four tables associated with it. For each result/trial, I only care about…

python memory web-scraping beautifulsoup scraper

asked Apr 27 '15 at 19:18

JohnR4785

73
3

3

votes

3 answers

Facebook-like on-demand meta content scraper

Have you ever seen how FB scrapes a link you post on Facebook (status, message, etc.) live, right after you paste it in the link field, and displays various metadata, a thumbnail of the image, various images from the page link, or a video thumbnail from a…

php facebook metadata scraper

asked Jun 03 '10 at 01:49

Toby

2,720
5
29
46

3

votes

1 answer

Beautiful Soup nested div (Adding extra function)

I am trying to extract the company name, address, and ZIP code from [www.quicktransportsolutions.com][1]. I have written the following code to crawl the site and return the information I need. import requests from bs4 import BeautifulSoup def…

python python-2.7 html-parsing beautifulsoup scraper

asked Sep 20 '14 at 23:52

icomefromchaos

225
4
13

3

votes

0 answers

Scraping (multiple) web page(s) and detecting changes

I'm currently trying to write down a concept for how I could solve the following: in Java, I'm scraping a web page with articles. If any of these articles becomes available or changes somehow, it should give me an alert. The scraping of all the…

java web-scraping screen-scraping scraper

asked Jul 23 '14 at 19:55

pythoniosIV

237
5
18

3

votes

5 answers

Scrape data from HTML pages using Java, output to database

I need to know how to create a scraper (in Java) to gather data from HTML pages and output it to a database. I do not have a clue where to start, so any information you can give me on this would be great. Also, you can't be too basic or simple…

java scraper

asked Mar 18 '10 at 15:29

Tanith

31
1
1
2

3

votes

1 answer

Rails - render :content_type has no effect

I'm developing a Ruby/Rails app which scrapes another website and renders an RSS feed with the data. Because this app is hosted on Heroku, I am generating the RSS feed via a controller, rather than dumping it to the file system and serving it as an…

ruby-on-rails-3 heroku rss mime-types scraper

asked Feb 03 '13 at 21:13

Daniel B.

1,650
1
19
40

3

votes

5 answers

How to extract the text between some anchor tags?

I need to extract the name of the artists from an HTML page. Here's a snippet of the page:

python anchor beautifulsoup scraper

asked Nov 06 '12 at 08:52

muchacho

55
1
6

3

votes

1 answer

How can I make my scraper website-design-change-tolerant?

I have written a web scraper in Ruby, but the websites that I am scraping have changed their design, so my scraper is failing. Is there a smart and simple solution to this kind of inherent problem with scrapers? (e.g. using some kind of…

ruby web-crawler scraper

asked Jul 14 '12 at 01:27

HPC_wizard

179
3
11

3

votes

3 answers

Why can I not scrape the title off this site?

I'm using simple-html-dom to scrape the title off of a specified site. find('title') as $element) echo $element->innertext .…

php html simple-html-dom scraper

asked Jul 12 '12 at 21:29

Alex

103
2
8

2

votes

0 answers

OGP endpoints that point to Facebook entities being incorrectly parsed by FB crawler?

Our app renders Like buttons that point back to an actual Facebook page. However, instead of pointing the Like button's href directly to the FB url, we proxy it through our servers through an opengraph endpoint. This is helpful because it allows us…

facebook-opengraph scraper

asked Dec 02 '11 at 23:47

diurnalist

408
3
9

2

votes

1 answer

Scrape a Price Div Class From a Page in PHP

php jquery screen-scraping web-scraping scraper

asked Sep 18 '11 at 22:59

m67

23
6

2

votes

3 answers

Ruby Mechanize web scraper library returns file instead of page

I have recently been using the Mechanize gem in Ruby to write a scraper. Unfortunately, the URL that I am attempting to scrape returns a Mechanize::File object instead of a Mechanize::Page object upon a GET request. I can't figure out why. Every…

ruby object mechanize scraper

asked Aug 02 '11 at 20:36

JRPete

3,074
3
19
17

2

votes

0 answers

How to get table row of a website that updates dynamically (simple html DOM parser)?

Basically, what I want to do is get a particular table row from a website. The table has an id of "table-data". I have already written the PHP, but I noticed that file_get_html doesn't actually get the data that is dynamically loaded. How should I…

php simple-html-dom scraper

asked Sep 06 '18 at 13:07

HessamSH

357
1
5
18

2

votes

1 answer

Trying to Scrape Reddit with praw.Reddit

I'm trying to scrape Reddit with the praw.Reddit command and I keep getting the following: prawcore.exceptions.OAuthException: unauthorized_client error processing request (Only script apps may use password auth). Here's the top of my code: (I removed…

python reddit scraper praw

asked Jun 21 '18 at 16:33

bullybear17

859
2
13
31

Questions tagged [scraper]