I'm getting the error in the title when trying to parse HTML with soupparser, lxml's external interface to the BeautifulSoup HTML parser. This is my code:

from lxml.html.soupparser import fromstring
fromstring("<html><body></body></html>")

Also, since I'm using Anaconda's Python distribution, I loaded BeautifulSoup like this:

import sys, bs4
sys.modules['BeautifulSoup'] = bs4

The error I'm getting is: TypeError: __init__() got an unexpected keyword argument 'convertEntities'. It is raised when soupparser calls bs4 with:

if 'convertEntities' not in bsargs:
    bsargs['convertEntities'] = 'html'
tree = beautifulsoup(source, **bsargs)
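
The mechanism can be reproduced without lxml or bs4 installed. The sketch below is only an illustration: StrictParser is a hypothetical stand-in for a constructor that rejects unknown keyword arguments, and drop_kwargs is a hypothetical helper that strips a keyword before delegating. Neither name exists in lxml or BeautifulSoup, and the sketch assumes the wrapped constructor is safe to call without the dropped argument.

```python
class StrictParser(object):
    """Hypothetical stand-in for a constructor that rejects unknown keywords."""

    def __init__(self, markup, features=None):
        self.markup = markup
        self.features = features


def drop_kwargs(factory, *names):
    """Return a callable that removes the named keywords before calling factory."""
    def wrapper(*args, **kwargs):
        for name in names:
            kwargs.pop(name, None)
        return factory(*args, **kwargs)
    return wrapper


# Same calling pattern as the soupparser snippet above.
bsargs = {}
if 'convertEntities' not in bsargs:
    bsargs['convertEntities'] = 'html'

# Calling the strict constructor directly raises TypeError.
try:
    StrictParser("<html><body></body></html>", **bsargs)
    error = None
except TypeError as exc:
    error = str(exc)

# The wrapped constructor discards the keyword and succeeds.
tolerant = drop_kwargs(StrictParser, 'convertEntities')
tree = tolerant("<html><body></body></html>", **bsargs)
```

soupparser's fromstring accepts a beautifulsoup argument, so a wrapper of this kind could in principle be passed there. I have not verified that this is enough to make soupparser work with bs4 in this setup.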

Also, the first time I run it in an IPython notebook, I get the following warning:

.../python2.7/site-packages/bs4/__init__.py:88: UserWarning: BS4 does not respect the convertEntities argument to the BeautifulSoup constructor. Entities are always converted to Unicode characters.
Tommz