I am trying to use the .read_html() function from the pandas library, but I keep getting the error below when I run the code in the shell. I read that lxml needs to be installed, so I installed it with apt-get. When I ran the code again, I got the same error.
(trusty)mdz5032@localhost:~$ sudo apt-get -y install python-lxml
[sudo] password for mdz5032:
Reading package lists... Done
.
.
.
python-lxml is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
Here is the code that I am using:
import pandas as pd
import pandas_datareader.data as web
import quandl
df = quandl.get("FMAC/HPI_PA", authtoken="")
fiddy_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
I took out the API key but can post it if needed.
Here is the full traceback:
Traceback (most recent call last):
  File "/home/mdz5032/pandasPractice.py", line 9, in <module>
    fiddy_states = pd.read_html('https://simple.wikipedia.org/wiki/List_of_U.S._states')
  File "/usr/local/lib/python3.4/dist-packages/pandas/io/html.py", line 874, in read_html
    parse_dates, tupleize_cols, thousands, attrs, encoding)
  File "/usr/local/lib/python3.4/dist-packages/pandas/io/html.py", line 726, in _parse
    parser = _parser_dispatch(flav)
  File "/usr/local/lib/python3.4/dist-packages/pandas/io/html.py", line 685, in _parser_dispatch
    raise ImportError("lxml not found, please install it")
ImportError: lxml not found, please install it
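Since apt-get reports that python-lxml is already installed while the traceback paths point at python3.4, one thing worth checking is whether the interpreter that runs the script can actually see lxml. The snippet below is only a diagnostic sketch, not a fix: module_available is a hypothetical helper name, and bs4 and html5lib are listed only because pandas can use them as an alternative HTML parser.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the running interpreter can locate the top-level module `name`."""
    return importlib.util.find_spec(name) is not None


# Show which interpreter is running, so it can be compared with the
# python3.4 paths that appear in the traceback.
print("interpreter:", sys.executable)
print("version:", sys.version.split()[0])

# lxml is the parser the error message asks for; bs4 and html5lib are
# included as an assumption about which fallback parsers might be installed.
for mod in ("lxml", "bs4", "html5lib"):
    print(mod, "available:", module_available(mod))
```

If lxml shows up as unavailable when this is run with the same interpreter as pandasPractice.py, the package was probably installed for a different Python version than the one running the script.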