
Boilerpipe is a great Java program for cleaning web pages, and I've used it in the past. Today I noticed that many users cannot install the Python wrapper and get 404 and other errors. Here is one of my attempts, run from an Anaconda Python 3.5 environment:

/Users/duncan>sudo -H pip install https://pypi.python.org/packages/source/b/boilerpipe-py3/boilerpipe-py3-1.2.0.0.tar.gz
Collecting https://pypi.python.org/packages/source/b/boilerpipe-py3/boilerpipe-py3-1.2.0.0.tar.gz
  Downloading boilerpipe-py3-1.2.0.0.tar.gz (1.3MB)
    100% |████████████████████████████████| 1.3MB 436kB/s
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-r6swd0hy-build/setup.py", line 33, in <module>
        download_jars(datapath=DATAPATH)
      File "/tmp/pip-r6swd0hy-build/setup.py", line 26, in download_jars
        urlretrieve(tgz_url, tgz_name)
      File "/Users/duncan/anaconda/lib/python3.5/urllib/request.py", line 188, in urlretrieve
        with contextlib.closing(urlopen(url, data)) as fp:
      File "/Users/duncan/anaconda/lib/python3.5/urllib/request.py", line 163, in urlopen
        return opener.open(url, data, timeout)
      File "/Users/duncan/anaconda/lib/python3.5/urllib/request.py", line 472, in open
        response = meth(req, response)
      File "/Users/duncan/anaconda/lib/python3.5/urllib/request.py", line 582, in http_response
        'http', request, response, code, msg, hdrs)
      File "/Users/duncan/anaconda/lib/python3.5/urllib/request.py", line 510, in error
        return self._call_chain(*args)
      File "/Users/duncan/anaconda/lib/python3.5/urllib/request.py", line 444, in _call_chain
        result = func(*args)
      File "/Users/duncan/anaconda/lib/python3.5/urllib/request.py", line 590, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 404: Not Found

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-r6swd0hy-build/

I've seen several suggested solutions that didn't work for me (for example, one that says Google changed its URL for a module). If anyone has a solution, I would be very grateful!

My platform is a Mac with 16 GB of RAM running OS X El Capitan, but I've seen this reported on Ubuntu and other platforms too. Thank you for any help!

2 Answers


I forked the project and re-downloaded boilerpipe-1.2.0-bin.tar.gz from code.google.com into my own repository: https://github.com/slaveofcode/boilerpipe3

You can install boilerpipe with pip:

pip install boilerpipe3

or directly from the project repository:

pip install git+ssh://git@github.com/slaveofcode/boilerpipe3@master
Aditya Kresna Permana

I had the same issue. It happens because the boilerpipe download URL has moved. I worked around it by changing this line in setup.py inside the installation tar.gz from PyPI:

Old line:
tgz_url = 'https://boilerpipe.googlecode.com/files/boilerpipe-{0}-bin.tar.gz'.format(version)

New line:
tgz_url = 'https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/boilerpipe/boilerpipe-{0}-bin.tar.gz'.format(version)

Then re-compress the entire folder and run pip install on the new archive.
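
The edit-and-repack step above can also be scripted. The following is only a sketch, assuming the downloaded sdist contains a setup.py with the old URL exactly as quoted above; the function name patch_sdist and the file names are hypothetical, not part of boilerpipe or pip.

```python
import io
import tarfile
from pathlib import Path

# Old and new download locations, as quoted in this answer.
OLD_URL = "https://boilerpipe.googlecode.com/files/boilerpipe-{0}-bin.tar.gz"
NEW_URL = (
    "https://storage.googleapis.com/google-code-archive-downloads/v2/"
    "code.google.com/boilerpipe/boilerpipe-{0}-bin.tar.gz"
)


def patch_sdist(src_tgz, dst_tgz):
    """Copy src_tgz to dst_tgz, swapping the download URL in any setup.py.

    Hypothetical helper. Returns the number of files that were changed.
    """
    patched = 0
    with tarfile.open(src_tgz, "r:gz") as src, tarfile.open(dst_tgz, "w:gz") as dst:
        for member in src.getmembers():
            # Directories and links carry no data, so there is nothing to read.
            fileobj = src.extractfile(member) if member.isfile() else None
            if fileobj is not None and Path(member.name).name == "setup.py":
                data = fileobj.read()
                if OLD_URL.encode("utf-8") in data:
                    data = data.replace(
                        OLD_URL.encode("utf-8"), NEW_URL.encode("utf-8")
                    )
                    patched += 1
                # The tar header must record the (possibly new) byte length.
                member.size = len(data)
                fileobj = io.BytesIO(data)
            dst.addfile(member, fileobj)
    return patched
```

For example, calling patch_sdist("boilerpipe-py3-1.2.0.0.tar.gz", "boilerpipe-py3-patched.tar.gz") and then running pip install ./boilerpipe-py3-patched.tar.gz should have the same effect as editing the file by hand. If the function returns 0, the URL line was not found and the archive was copied unchanged.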

Okiriza