I am trying to use the .read_html() function in the pandas library and keep getting this error when I run the code in the shell. I read that lxml needs to be installed, so I installed it with apt-get, but I still get the same error when I run the code again.

(trusty)mdz5032@localhost:~$ sudo apt-get -y install python-lxml
[sudo] password for mdz5032: 
Reading package lists... Done
.
.
.
python-lxml is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.

Here is the code that I am using

import pandas as pd
import pandas_datareader.data as web
import quandl


df = quandl.get("FMAC/HPI_PA", authtoken="")

fiddy_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')

I took out the API key but can post it if needed.

Here is the full traceback

Traceback (most recent call last):
  File "/home/mdz5032/pandasPractice.py", line 9, in <module>
    fiddy_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
  File "/usr/local/lib/python3.4/dist-packages/pandas/io/html.py", line 874, in read_html
    parse_dates, tupleize_cols, thousands, attrs, encoding)
  File "/usr/local/lib/python3.4/dist-packages/pandas/io/html.py", line 726, in _parse
    parser = _parser_dispatch(flav)
  File "/usr/local/lib/python3.4/dist-packages/pandas/io/html.py", line 685, in _parser_dispatch
    raise ImportError("lxml not found, please install it")
ImportError: lxml not found, please install it
Mark
  • Are you using a virtualenv? What happens when you run ``python -c 'import lxml'``? Can you post the error message? – notorious.no Jul 18 '16 at 16:25
  • Did you check which folder `apt-get` is installing to? It may simply be saving the installed modules in the wrong folder; this has happened to me in the past. You can work around this by using pip instead of apt-get as your Python package manager. – Akshat Mahajan Jul 18 '16 at 16:27
  • @notorious When I run python -c 'import lxml' I'm not getting any errors. – Mark Jul 18 '16 at 17:05
  • @AkshatMahajan How do I check that? I am still pretty new to all this Linux business. I tried using pip3 for python-lxml but it said it couldn't be found, so I assumed I had to use apt-get. – Mark Jul 18 '16 at 17:06
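
One way to answer the "which folder" question from the comments is to ask the interpreter itself. The sketch below is only an illustration, not part of the original thread: `describe_module` is a hypothetical helper name, and it relies on the standard-library `importlib.util.find_spec`, which is available from Python 3.4 onward. Run it with the same interpreter that runs the failing script (for example `python3`), since plain `python` may be a different installation.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the file a module would be imported from, or None if this
    interpreter cannot find it. (Hypothetical helper for illustration.)"""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Which interpreter is actually running this code?
print("interpreter:", sys.executable)
print("version:", sys.version_info[:2])

# None here means lxml is not installed for *this* interpreter,
# even if another Python on the machine can import it.
print("lxml:", describe_module("lxml"))
```

If the `lxml` line prints `None` under `python3` but `python -c 'import lxml'` succeeds, the package is most likely installed for a different Python version than the one running the script.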

1 Answer

sudo apt-get install python3-lxml

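You've installed lxml for Python 2 (the python-lxml package), but your code is running under Python 3: the paths in your traceback point at /usr/local/lib/python3.4/dist-packages.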
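
To catch this kind of mismatch before pandas raises its own error, a script can check for lxml up front and suggest the matching package. This is a rough sketch under assumptions: `apt_package_for` is a hypothetical helper, and it assumes the Debian/Ubuntu naming seen in this thread (`python-lxml` for Python 2, `python3-lxml` for Python 3), which may not hold on other distributions or newer releases.

```python
import sys


def apt_package_for(module, major=sys.version_info[0]):
    """Guess the apt package name for `module` under Python `major`.

    Assumes the python-/python3- prefix convention used in this thread.
    """
    prefix = "python3-" if major >= 3 else "python-"
    return prefix + module


try:
    import lxml  # noqa: F401  (imported only to test availability)
except ImportError:
    print("lxml is missing for", sys.executable)
    print("try: sudo apt-get install", apt_package_for("lxml"))
else:
    print("lxml is importable from", sys.executable)
```

Run under Python 3 without lxml, this would suggest `python3-lxml`, which is the package named above.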

Winston Ewert
  • I just tried that and now I am getting a new error with that install: [E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?] I tried doing an update but no dice. – Mark Jul 18 '16 at 17:18
  • Here is the other part of the error: Install these packages without verification? [y/N] y Err http://archive.ubuntu.com/ubuntu/ trusty-updates/main python3-lxml amd64 3.3.3-1ubuntu0.1 Could not resolve 'archive.ubuntu.com' Err http://archive.ubuntu.com/ubuntu/ trusty-security/main python3-lxml amd64 3.3.3-1ubuntu0.1 Could not resolve 'archive.ubuntu.com' E: Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/l/lxml/python3-lxml_3.3.3-1ubuntu0.1_amd64.deb Could not resolve 'archive.ubuntu.com' – Mark Jul 18 '16 at 17:19
  • Wait, I ran apt-get -f install and that fixed the package. Thanks for the help! – Mark Jul 18 '16 at 17:35