I'm trying to retrieve news from the BBC RSS feed and save certain parts locally as XML (although this code only prints it). I can retrieve everything I want except the pubDate. I get this error:
  File "/Library/Python/2.7/site-packages/feedparser.py", line 416, in __getattr__
    raise AttributeError, "object has no attribute '%s'" % key
AttributeError: object has no attribute 'pubDate'
I'm not sure why, as nothing else I've retrieved has caused any problems. Here is the code:
import feedparser
import xml.etree.cElementTree as ET
from xml.dom import minidom

BBCHome = feedparser.parse('http://feeds.bbci.co.uk/news/rss.xml')

def prettify(elem):
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent=" ")

root = ET.Element('root')
for story in BBCHome.entries:
    item = ET.SubElement(root, 'item')
    title = ET.SubElement(item, 'title')
    title.text = story.title
    # why doesn't pubDate work?
    pubDate = ET.SubElement(item, 'pubDate')
    pubDate.text = story.pubDate
    description = ET.SubElement(item, 'description')
    description.text = story.description
    link = ET.SubElement(item, 'link')
    link.text = story.link

print prettify(root)
After reading https://pythonhosted.org/feedparser/namespace-handling.html, I think it might have something to do with namespaces, but to be honest I don't really understand it. I've looked at the raw feed, and pubDate seems to be just another sub-element of item, like description or title.
I'd greatly appreciate help with how to fix this and why it isn't working. Thanks.