
When I parse the example RSS link provided by BBC Weather, I get only an empty feed. The example link is: "https://weather-broker-cdn.api.bbci.co.uk/en/forecast/rss/3day/2643123"

I've tried using the feedparser module in Python. I would like to do this in either Python or C++, but Python seemed easier. I've also tried rewriting the URL without https:// and with .xml, and it still doesn't work.

import feedparser
d = feedparser.parse('https://weather-broker-cdn.api.bbci.co.uk/en/forecast/rss/3day/2643123')
print(d)

This should give a result similar to the RSS feed at the link, but I just get an empty feed.

  • On my Synology with Python 2.7.12 it outputs a dictionary as expected. On my laptop (Win 64, Python 3.7.4, latest feedparser) it throws a StopIteration exception while processing the latitude and longitude in the `georss:point` tag. I tried executing the statements with the values fed to the `_gen_georss_coords` method in feedparser and it appears to work fine, so I'm somewhat baffled. – Deepstop Sep 07 '19 at 02:23

1 Answer


First, I know you got no result rather than an error like I did. Perhaps you are running a different version. As I mentioned, it yields a result on an older version in Python 2, using a program that has been running solidly every night for about 5 years, but it throws an exception on a freshly installed feedparser 5.2.1 on 64-bit Python 3.7.4.

I'm not entirely sure what is going on, but the function called _gen_georss_coords is throwing a StopIteration on the first call. I have seen some references to this error being caused by the implementation of PEP 479. The function is written as a generator, but for your RSS feed it only has to return one tuple. Here is the offending function:

def _gen_georss_coords(value, swap=True, dims=2):
    # A generator of (lon, lat) pairs from a string of encoded GeoRSS
    # coordinates. Converts to floats and swaps order.
    latlons = map(float, value.strip().replace(',', ' ').split())
    nxt = latlons.__next__
    while True:
        t = [nxt(), nxt()][::swap and -1 or 1]
        if dims == 3:
            t.append(nxt())
        yield tuple(t)
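
The same failure can be reproduced without feedparser. The following is only a minimal sketch, assuming Python 3.7 or later; the names pairs and collect are made up for illustration and are not part of feedparser. It uses the same pattern of calling next() inside an endless loop in a generator:

```python
def pairs(values):
    # Hypothetical stand-in for the feedparser helper: pull items two at a
    # time from an iterator inside an endless loop.
    it = iter(values)
    while True:
        # next(it) raises StopIteration once the input is exhausted.
        yield (next(it), next(it))


def collect(values):
    # Return ("ok", items) on success, or ("error", name of the original
    # exception) if the generator fails with RuntimeError.
    try:
        return ("ok", list(pairs(values)))
    except RuntimeError as exc:
        return ("error", type(exc.__cause__).__name__)


result = collect([51.5, -0.12])
print(result)
```

On Python 3.7 and later the first pair is produced, but the next call to next(it) raises StopIteration inside the generator body, and PEP 479 turns that into a RuntimeError with the StopIteration attached as its cause.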

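My best guess is that this is down to PEP 479, which is always in effect from Python 3.7. The function mixes two iterators, the map object and the generator itself, and when nxt() runs out of values inside the while True loop, the StopIteration it raises is no longer treated as the normal end of the generator. Instead it is reported as an error and bubbles up to the calling function. Anyway, I rewrote it in a somewhat more straightforward way.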

def _gen_georss_coords(value, swap=True, dims=2):
    # A generator of (lon, lat) pairs from a string of encoded GeoRSS
    # coordinates. Converts to floats and swaps order.
    latlons = list(map(float, value.strip().replace(',', ' ').split()))
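    # Step by dims so that each complete group of coordinates is used once;
    # an incomplete trailing group is skipped.
    for i in range(0, len(latlons) - dims + 1, dims):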
        t = [latlons[i], latlons[i+1]][::swap and -1 or 1]
        if dims == 3:
            t.append(latlons[i+2])
        yield tuple(t)

You can define the rewritten function in your own code, then execute the following to patch it into feedparser:

saveit, feedparser._gen_georss_coords = (feedparser._gen_georss_coords, _gen_georss_coords)

Once you're done with it, you can restore feedparser to its previous state with:

feedparser._gen_georss_coords, _gen_georss_coords = (saveit, feedparser._gen_georss_coords)

Or if you're confident that this is solid, you can modify feedparser itself. Anyway I did this trick and your rss feed suddenly started working. Perhaps in your case it will also result in some improvement.
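
The patch-and-restore step described above can also be written so that the original function is put back automatically. This is only a sketch using unittest.mock.patch.object from the standard library; fake_module and replacement are stand-ins invented here so that the example runs without feedparser installed. With feedparser 5.2.1 you would pass the feedparser module and the rewritten _gen_georss_coords instead.

```python
from types import SimpleNamespace
from unittest import mock

# Stand-in for the feedparser module (an assumption for this sketch only).
fake_module = SimpleNamespace(_gen_georss_coords=lambda value: "original")


def replacement(value):
    # Stand-in for the rewritten _gen_georss_coords.
    return "patched"


# The attribute is swapped on entry and restored on exit, even if the body
# of the with block raises an exception.
with mock.patch.object(fake_module, "_gen_georss_coords", replacement):
    inside = fake_module._gen_georss_coords("51.5 -0.12")

after = fake_module._gen_georss_coords("51.5 -0.12")
print(inside, after)
```

The advantage over swapping the names by hand is that there is no saved variable to keep track of, and the restore cannot be forgotten.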

Deepstop
  • Wow thanks you really went all out but it helps loads. Do you think it's worth mentioning this to the lib creator? – user3486373 Sep 07 '19 at 10:20
  • I searched on `_gen_georss_coords` and found that the issue was identified months ago and is the result of changes in Python 3.7: https://github.com/custom-components/feedparser/issues/10 but the new version hasn't been released. The current version is from 2015. Apparently there is a 'hotfix' available. – Deepstop Sep 07 '19 at 12:05