I am creating an RSS reader and I want to pull the url attribute of media:content using feedparser on Google App Engine, but I am running into problems when an entry doesn't have a media_content attribute.

for feedURL in feedURLs:
    logging.debug('feedURL iteration')
    feed = feedparser.parse(feedURL.sourceLink)
    for entry in feed.entries:
        logging.debug('entry iteration')
        title = entry.get('title')
        link = entry.get('link')
        description = entry.get('description')
        pubDate = entry.get('pubDate')
        image = entry.get('image')
        mediaContent = entry.media_content

This works great if I eliminate the mediaContent line, but it fails when it is included. I think it is because only a few of the entries have media:content tags. Is there a way to get the url of the media:content tag when it exists and just have mediaContent set to None when it doesn't? Am I barking up the wrong tree?

This is the error in the log:

object has no attribute 'media_content' Traceback (most recent call last): File "/base/data/home/runtimes/python27/python27_lib/versions/third_party

Thanks!

1 Answer

Across the many feed formats and their versions, you will run into this problem very often.

The feedparser docs say, literally: "Feeds in the real world may be missing elements, even elements that are required by the specification. You should always test for the existence of an element before getting its value. Never assume an element is present." They also suggest a solution:

>>> 'media_content' in entry
False  # in your case; True if the element exists
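
To get the url when it exists and None otherwise, the existence test can be wrapped in a small helper. This is only a minimal sketch: `first_media_url` is a hypothetical name, not part of feedparser, and it assumes `media_content` is a list of dicts that each carry a `url` key, which is how feedparser usually exposes media:content. The two plain dicts stand in for real entries so the example runs without fetching a feed.

```python
def first_media_url(entry):
    """Return the url of the first media:content item, or None if absent."""
    # .get() returns None instead of raising AttributeError when the
    # entry has no media_content; "or []" also covers an empty value.
    media_items = entry.get('media_content') or []
    for item in media_items:
        url = item.get('url')
        if url:
            return url
    return None


# Stand-in entries (plain dicts, assumed shape) instead of a parsed feed.
with_media = {
    'title': 'has media',
    'media_content': [{'url': 'http://example.com/a.jpg', 'type': 'image/jpeg'}],
}
without_media = {'title': 'no media'}

print(first_media_url(with_media))
print(first_media_url(without_media))
```

In the loop from the question, this would be used as mediaContent = first_media_url(entry), since feedparser entries are dict-like and support both `in` and `.get()`.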
Gianni Di Noia