Hi, I am having a problem downloading and reading an RSS feed from a particular site. The downloaded feed appears to be in a binary format. Can anybody tell me how to get it back into a readable form that I can then pass to Beautiful Soup for parsing?
Here is my code so far:
import urllib2
from BeautifulSoup import BeautifulSoup
rss_feed = urllib2.urlopen("http://kat.ph/usearch/ubuntu/?rss=1", timeout=5.0).read()
print rss_feed  # displays binary data, not the expected XML
rss_feed_soup = BeautifulSoup(rss_feed)
To clarify: I cannot get the expected XML when reading the feed with urllib2, yet the same feed displays correctly in any modern web browser. What am I missing here? Is the RSS feed binary encoded, and if so, how do I decode it correctly?
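One guess I have not confirmed: the server may be sending the feed gzip-compressed, which browsers decompress transparently but urllib2 does not. Below is a minimal sketch under that assumption. The helper name maybe_gunzip is my own invention, and the check relies only on the standard two-byte gzip signature at the start of the data.

```python
import gzip
import io


def maybe_gunzip(data):
    """Hypothetical helper: return data decompressed if it looks like gzip.

    Assumption: the "binary" response is a gzip stream, which always
    starts with the two magic bytes 0x1f 0x8b. Anything else is
    returned unchanged.
    """
    if data[:2] == b"\x1f\x8b":
        # Wrap the raw bytes in a file-like object so GzipFile can read them.
        return gzip.GzipFile(fileobj=io.BytesIO(data)).read()
    return data
```

If the guess is right, wrapping the download as maybe_gunzip(urllib2.urlopen(...).read()) should give back plain XML that can go straight into BeautifulSoup. If the output is still unreadable, the feed is encoded some other way and this sketch does not apply.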
Thanks for any replies.