Questions tagged [boilerpipe]

The boilerpipe library for Java provides algorithms to detect and remove the surplus "clutter" (boilerplate, templates) around the main textual content of a web page.

77 questions
0
votes
1 answer

ImportError: No module named boilerpipe

Every time I run the following code: from boilerpipe.extract import Extractor I get this error: Traceback (most recent call last): File "", line 1, in File "build/bdist.linux-x86_64/egg/boilerpipe/__init__.py", line…
khassan
  • 349
  • 1
  • 2
  • 5
0
votes
1 answer

I added a library directly to my src folder, Eclipse seems to be compiling properly, but can't find the class file?

I added the boilerpipe library directly to my src folder. Everything seems to be compiling when I run, but I get an error telling me that one of the classes in the boilerpipe library could not be resolved. The ArticleExtractor class is what I'm…
user3626745
  • 3
  • 1
  • 3
0
votes
0 answers

I cleaned all projects; now when I try to run I get "Error: Could not find or load main class"?

I wanted to use boilerpipe, so I added all the .jar files to the build path for my project. I added import de.l3s.boilerpipe.extractors.DefaultExtractor; and in one of my methods I call return DefaultExtractor.INSTANCE.getText(someURL); Eclipse is telling me that…
user3626745
  • 3
  • 1
  • 3
0
votes
1 answer

Not able to parse New York Times article using boilerpipe

I am trying to get a news article from a New York Times URL, but it gives no output, whereas any other newspaper I try gives output. I want to know whether something is wrong with my code or whether boilerpipe is unable to fetch it. Plus sometimes…
Rohan Singh Dhaka
  • 173
  • 2
  • 8
  • 33
0
votes
1 answer

Python boilerpipe with Google App Engine: getting import error?

I am trying to get Python boilerpipe working with Google App Engine. I have installed boilerpipe and it's working fine on my local machine. I installed boilerpipe using pip (pip install boilerpipe, GitHub link). The sample program works fine with the given URL from…
kongaraju
  • 9,344
  • 11
  • 55
  • 78
0
votes
1 answer

python pip package installation JAVA_HOME error?

This issue concerns working with Java and Python. I would like to install the boilerpipe package using pip. I have been working on it for the last two days with no success. pip install boilerpipe fails with the error JAVA_HOME not found. The Java JDK and JRE are both…
user1834809
  • 1,311
  • 4
  • 17
  • 28
0
votes
2 answers

Boilerpipe to extract non-English news articles

I am trying to use boilerpipe to extract news articles from non-English text. I have already seen this and it's not working for me. I made the following changes: 1) Modified HTMLfetcher.java. Appended the following lines before the end of the method fetch: byte[] utf8…
Cool Coder
  • 309
  • 2
  • 11
0
votes
2 answers

Converting string into JSON

I extracted data from blogs using article extractor which returns articles in a string format. Since some pages have sub-links that go into news content I want that data to be extracted too. So, how can I access the data that is inside the…
chopu
  • 27
  • 1
  • 10
0
votes
2 answers

Article Extraction - Ruby

Is there any way to extract only the content from a webpage using Ruby (avoiding links and other stuff)?
Mothirajha
  • 1,033
  • 1
  • 10
  • 18
0
votes
1 answer

Retain boilerplate using boilerpipe

I am using the boilerpipe library to analyze news articles. These news articles contain a lot of boilerplate such as copyright information, a side pane of related articles, etc. Boilerpipe removes all that information. Is it possible to return the…
abhinavkulkarni
  • 2,284
  • 4
  • 36
  • 54
0
votes
0 answers

Boilerpipe python wrapper: ImportError: No module named extract

I successfully installed JPype and Boilerpipe Python wrapper. My JAVA_HOME path is correct (as far as I know). I created a python file with the following code: from boilerpipe.extract import Extractor extractor =…
TheProofIsTrivium
  • 768
  • 2
  • 11
  • 25
0
votes
1 answer

Java - Boilerpipe running in Eclipse not working properly for a demo program

I'm running boilerpipe in Eclipse. I'm just trying to get it to work; here is the code: package de.l3s.boilerpipe.demo; import java.net.URL; import de.l3s.boilerpipe.extractors.DefaultExtractor; public static void main(final String[] args)…
Ostap Hnatyuk
  • 1,116
  • 2
  • 14
  • 20
0
votes
2 answers

Boilerpipe Starter issue

I am new to boilerpipe. I tried to run sample code given on their website: import java.net.URL; import de.l3s.boilerpipe.extractors.ArticleExtractor; import de.l3s.boilerpipe.extractors.DefaultExtractor; public class TESTURLBOILERPIPE { …
Zahran
  • 419
  • 4
  • 10
0
votes
1 answer

How to run and get document stats from boilerpipe article extractor?

There's something I'm not quite understanding about the use of boilerpipe's ArticleExtractor class. Then again, I am also very new to Java, so perhaps my basic knowledge of this environment is at fault. Anyhow, I'm trying to use boilerpipe to extract…
brneuro
  • 326
  • 1
  • 5
  • 15
0
votes
1 answer

How to install Boilerpipe on Windows?

Can anyone tell me how to use boilerpipe on Windows with NetBeans? I'd be grateful if you could give me some Java code to start with.
dark_shadow
  • 3,503
  • 11
  • 56
  • 81