My Python code can't parse the "description" field from an RSS feed. When I run the following script, it prints multiple blank lines. How can I parse it correctly?

import feedparser
import unidecode
rss_url = "http://my.blogspot.com/feeds/posts/default?alt=rss"
feed = feedparser.parse( rss_url )
# Each item in feed["entries"] is one post, not a dictionary key.
for entry in feed["entries"]:
    print(unidecode.unidecode(entry["description"]))

RSS description section:

<description>&lt;iframe src=&quot;https://domain.com/embed/NTXFZhHw/01-10-1080p.mp4&quot; scrolling=&quot;no&quot; frameborder=&quot;0&quot; width=&quot;700&quot; height=&quot;430&quot; allowfullscreen=&quot;true&quot; webkitallowfullscreen=&quot;true&quot; mozallowfullscreen=&quot;true&quot;&gt;&lt;/iframe&gt;  </description>
Thomas G. Lau
  • Does this occur with every feed or is it just one particular feed? If it is only one feed there might be a problematic character in one of the description fields. – Kmeixner May 20 '15 at 15:32
  • Only my feed, since it has tons of strange characters. How can I fix it? – Thomas G. Lau May 20 '15 at 23:34

1 Answer

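feedparser sanitizes HTML in entry content by default and drops any element that is not on its whitelist. iframe is not on it, so a description that contains only an iframe comes out blank. To keep the iframe, add it to the whitelist (note that _HTMLSanitizer is a private class, so this may break between feedparser versions). Replace: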

import feedparser

with:

import feedparser
feedparser._HTMLSanitizer.acceptable_elements.update(['iframe'])
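
To see why the output is blank rather than garbled, here is a minimal standard-library sketch that does not use feedparser at all. It decodes a shortened copy of the description from the question, then removes the iframe tags with a hypothetical helper, `drop_tag`, which is only a rough stand-in for a whitelist sanitizer and not feedparser's actual implementation. The `<item>` wrapper is also an assumption added for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Shortened copy of the <description> from the question, wrapped in a
# minimal <item> (the wrapper is an assumption, for illustration only).
ITEM_XML = (
    "<item><description>"
    "&lt;iframe src=&quot;https://domain.com/embed/NTXFZhHw/01-10-1080p.mp4&quot; "
    "scrolling=&quot;no&quot; frameborder=&quot;0&quot; width=&quot;700&quot; "
    "height=&quot;430&quot; allowfullscreen=&quot;true&quot;&gt;&lt;/iframe&gt;  "
    "</description></item>"
)


def drop_tag(html, tag):
    """Hypothetical stand-in for a whitelist sanitizer.

    Removes the opening and closing tags of `tag` and keeps any text
    between them. This is not feedparser's real code.
    """
    return re.sub(r"</?%s\b[^>]*>" % re.escape(tag), "", html)


# The XML parser decodes &lt; and &quot;, so the text is plain HTML.
description = ET.fromstring(ITEM_XML).findtext("description")

# With the iframe tags gone, only the trailing whitespace is left.
sanitized = drop_tag(description, "iframe")

print(repr(description))
print(repr(sanitized))
```

The description holds nothing except the iframe, so once those tags are stripped the remaining string is whitespace only, which matches the blank lines in the question.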
smoothBlue