I have a list of HTML pages that may contain encoded characters. Some examples are below:
<a href="mailto:lad%20at%20maestro%20dot%20com">
<em>ada@graphics.maestro.com</em>
<em>mel@graphics.maestro.com</em>
I would like to decode (or unescape, I'm unsure of the correct terminology) these strings to:
<a href="mailto:lad at maestro dot com">
<em>ada@graphics.maestro.com</em>
<em>mel@graphics.maestro.com</em>
Note that the HTML pages are held as strings. Also, I DO NOT want to use any external library such as BeautifulSoup or lxml; only native Python libraries are OK.
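To make the goal concrete, here is a minimal sketch of the kind of standard-library-only decoding I have in mind. It is written for Python 3, where the relevant functions are html.unescape and urllib.parse.unquote (in Python 2 the rough equivalents are HTMLParser.HTMLParser().unescape and urllib.unquote). The function name decode_page and the sample inputs are hypothetical, and running unquote over a whole page is a simplification that assumes the page has no literal percent sequences that should be kept.

```python
from html import unescape
from urllib.parse import unquote


def decode_page(page):
    """Decode percent-encoded sequences and HTML entities in a page string.

    Hypothetical helper: applies both decoders to the whole string, which is
    only safe if the page has no '%XX' text that must stay as-is.
    """
    # Turn percent-encoded sequences such as %20 back into characters.
    page = unquote(page)
    # Turn HTML entities such as &#64; or &amp; back into characters.
    return unescape(page)


# Hypothetical sample inputs, modelled on the examples above.
print(decode_page('<a href="mailto:lad%20at%20maestro%20dot%20com">'))
print(decode_page('<em>ada&#64;graphics.maestro.com</em>'))
```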
Edit:
The solution below isn't perfect. Unescaping with HTMLParser together with urllib2 throws the following error in some cases:
UnicodeDecodeError: 'ascii' codec can't decode byte 0x94 in position 31: ordinal not in range(128)
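This error is what Python 2 raises when a byte string containing non-ASCII bytes is implicitly decoded as ASCII, which probably happens here when the raw page bytes get mixed with the unicode characters produced by unescaping. A minimal sketch of one way around it, shown in Python 3 for clarity: decode the raw bytes explicitly before unescaping. The function name decode_raw_page is hypothetical, and the cp1252 default is only an assumption (byte 0x94 is a closing curly quote in Windows-1252); the real encoding should come from the page's headers or meta tag.

```python
from html import unescape


def decode_raw_page(raw, encoding="cp1252"):
    """Decode raw page bytes to text, then unescape HTML entities.

    Hypothetical helper; the default encoding is an assumption and should
    be replaced by whatever the page actually declares.
    """
    # Explicit decoding avoids any implicit ASCII decode of bytes like 0x94.
    # errors="replace" substitutes U+FFFD for bytes the codec cannot map.
    text = raw.decode(encoding, errors="replace")
    return unescape(text)


# Hypothetical sample: 0x93 and 0x94 are curly quotes in Windows-1252.
sample = b"<em>\x93mel&#64;graphics.maestro.com\x94</em>"
print(decode_raw_page(sample))
```

The same idea applies in Python 2: call .decode(encoding) on the bytes returned by urllib2 before passing them to the unescaping step, so that everything is unicode from that point on.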