I have had great success parsing RSS feeds from the National Hurricane Center using the feedparser module:

import feedparser
feedparser.parse('https://www.nhc.noaa.gov/gis-at.xml') #Works Fine
feedparser.parse('https://www.nhc.noaa.gov/gis-ep.xml') #Works Fine

However, when I try to read the superficially similar feed from the Central Pacific Hurricane Center, I get a KeyError:

feedparser.parse('http://www.prh.noaa.gov/cphc/gis-cp.xml') #Doesn't work

Is this a bug with feedparser? Is the CPHC's feed malformed? Is there an option that I've forgotten to specify? It seems the trouble is that there isn't a key named 'where', but I don't know why this isn't a problem for the NHC feeds. The stack trace is reproduced below:

>>>  import feedparser
>>>  feedparser.parse('http://www.prh.noaa.gov/cphc/gis-cp.xml')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../anaconda3/lib/python3.6/site-packages/feedparser.py", line 3956, in parse
    saxparser.parse(source)
  File ".../anaconda3/lib/python3.6/xml/sax/expatreader.py", line 111, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File ".../anaconda3/lib/python3.6/xml/sax/xmlreader.py", line 125, in parse
    self.feed(buffer)
  File ".../anaconda3/lib/python3.6/xml/sax/expatreader.py", line 217, in feed
    self._parser.Parse(data, isFinal)
  File "/tmp/build/80754af9/python_1516124163501/work/Modules/pyexpat.c", line 414, in StartElement
  File ".../anaconda3/lib/python3.6/xml/sax/expatreader.py", line 370, in start_element_ns
    AttributesNSImpl(newattrs, qnames))
  File ".../anaconda3/lib/python3.6/site-packages/feedparser.py", line 2031, in startElementNS
    self.unknown_starttag(localname, list(attrsD.items()))
  File ".../anaconda3/lib/python3.6/site-packages/feedparser.py", line 666, in unknown_starttag
    return method(attrsD)
  File ".../anaconda3/lib/python3.6/site-packages/feedparser.py", line 1500, in _start_gml_point
    self._parse_srs_attrs(attrsD)
  File ".../anaconda3/lib/python3.6/site-packages/feedparser.py", line 1496, in _parse_srs_attrs
    context['where']['srsName'] = srsName
  File ".../anaconda3/lib/python3.6/site-packages/feedparser.py", line 356, in __getitem__
    return dict.__getitem__(self, key)
KeyError: 'where'

1 Answer

I know this is an old question, but I ran into this issue myself, and fixing it became my first open-source contribution :)

Is this a bug with feedparser?

Yes, it was.

Is the CPHC's feed malformed?

Also yes, or at least it doesn't follow the GeoRSS GML model to the letter. If you check the description of the GML Point element, you will see the following structure:

<georss:where>
  <gml:Point>
    <gml:pos>45.256 -71.92</gml:pos>
  </gml:Point>
</georss:where>

but the feed data is structured this way:

<gml:Point>
  <gml:pos>45.256 -71.92</gml:pos>
</gml:Point> 

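That's why the KeyError: 'where' occurs: the feed has no where tag wrapping the gml:Point, so feedparser never creates the 'where' entry that it later tries to write srsName into.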
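If you want to check whether a feed has this problem before handing it to feedparser, something like the following might help. It is only a rough sketch using the standard library's xml.etree.ElementTree, not anything feedparser itself does. The function name count_bare_gml_points and the two sample documents are made up for illustration, and I'm assuming the usual GeoRSS and GML namespace URIs.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (the ones commonly used for GeoRSS and GML).
GEORSS_NS = "http://www.georss.org/georss"
GML_NS = "http://www.opengis.net/gml"


def count_bare_gml_points(xml_text):
    """Count gml:Point elements whose parent is not georss:where.

    Hypothetical helper for illustration; not part of feedparser.
    """
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    bare = 0
    for point in root.iter(f"{{{GML_NS}}}Point"):
        parent = parent_of.get(point)
        if parent is None or parent.tag != f"{{{GEORSS_NS}}}where":
            bare += 1
    return bare


# Structure described by the GeoRSS GML model.
WRAPPED = """<item xmlns:georss="http://www.georss.org/georss"
      xmlns:gml="http://www.opengis.net/gml">
  <georss:where>
    <gml:Point><gml:pos>45.256 -71.92</gml:pos></gml:Point>
  </georss:where>
</item>"""

# Structure that triggers the KeyError: no georss:where wrapper.
BARE = """<item xmlns:georss="http://www.georss.org/georss"
      xmlns:gml="http://www.opengis.net/gml">
  <gml:Point><gml:pos>45.256 -71.92</gml:pos></gml:Point>
</item>"""

print(count_bare_gml_points(WRAPPED))
print(count_bare_gml_points(BARE))
```

A non-zero count means the document has at least one gml:Point outside a georss:where, which is the shape that trips up the affected feedparser versions.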

This was fixed in feedparser's 6.0.9 hotfix release (see https://github.com/kurtmckee/feedparser/pull/306), so upgrading feedparser should make the error go away.
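To see whether your installed version already has the fix, you could compare it against 6.0.9. This is just a sketch: has_gml_fix and installed_feedparser_has_fix are names I made up, the version parsing is deliberately simplistic (it only looks at the first three numbers), and importlib.metadata needs Python 3.8 or later, so it won't work on the Python 3.6 from the question.

```python
import re
from importlib import metadata

# Version that contains the fix, according to the answer above.
FIXED_IN = (6, 0, 9)


def has_gml_fix(version_string):
    """Return True if version_string is at least FIXED_IN.

    Hypothetical helper; only compares the first three numeric parts.
    """
    numbers = [int(n) for n in re.findall(r"\d+", version_string)[:3]]
    # Pad short versions such as "6.0" with zeros.
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers) >= FIXED_IN


def installed_feedparser_has_fix():
    """Check the installed feedparser; None if it is not installed."""
    try:
        return has_gml_fix(metadata.version("feedparser"))
    except metadata.PackageNotFoundError:
        return None


print(installed_feedparser_has_fix())
```

If this prints False, running pip install --upgrade feedparser should be enough.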

Nestor