I've been struggling with this for a while now. The following code snippet returns None for some websites even when the charset is present in a meta tag in the page's head, so it doesn't seem to be a reliable way to get the proper charset of a webpage.
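import urllib2
conn = urllib2.urlopen(req)  # req is a urllib2.Request (or URL string) built earlier
charset = conn.headers.getparam('charset')  # charset param of the HTTP Content-Type header, or None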
I read several threads here on SO, and some of them mention using chardet, but I don't want to import an additional module if possible. Instead, I'm thinking of downloading only the header and extracting the charset info with some string functions.
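Roughly, what I have in mind is something like the sketch below. It is untested against real sites. The function name sniff_charset, the regex, and the idea of passing in only the first chunk of the page are my own assumptions for illustration, not an established recipe:

```python
import re

# Hypothetical pattern: matches <meta charset="utf-8"> as well as
# <meta http-equiv="Content-Type" content="text/html; charset=utf-8">.
META_CHARSET_RE = re.compile(
    br'<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)', re.IGNORECASE)


def sniff_charset(header_charset, head_bytes):
    """Prefer the HTTP header charset; otherwise look for a <meta> tag.

    header_charset: value from conn.headers.getparam('charset'), may be None.
    head_bytes: the first chunk of the response body as a byte string.
    Returns a lowercase charset name, or None if nothing was found.
    """
    if header_charset:
        return header_charset.lower()
    match = META_CHARSET_RE.search(head_bytes)
    if match:
        return match.group(1).decode('ascii').lower()
    return None
```

With the snippet above it would be called as sniff_charset(conn.headers.getparam('charset'), conn.read(1024)), where 1024 bytes is just a guess at how much of the page is enough to contain the meta tag.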
Does anybody have a better idea?