I'm trying to parse a feed in Python using feedparser, but all I get back is None. I'm not sure what I'm missing. Here is my code:

import feedparser

def rss(self):
    rss = 'https://news.google.com/news?q=fashion&output=rss'
    feed = feedparser.parse(rss)
    for key in feed.entries: 
        return key.title

If you think there is a better RSS/XML feed parser, please let me know. (I'm new to Python.)

print(key) displays None and print(len(feed.entries)) also displays None.

print(feed)
{'feed': {}, 'entries': [], 'bozo': 1, 'bozo_exception': URLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)'),)}

print(feedparser)
<module 'feedparser' from '/Users/User_name/python-projects/my_env/lib/python3.6/site-packages/feedparser.py'>
kevinabraham

2 Answers

Try the following basic code. It works fine for me and returned 10 items from the feed when I ran it.

  1. Install feedparser with pip
pip install feedparser
  2. Usage
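# Python 3 (the question uses 3.6); on Python 2 use urllib2.urlopen instead.
import urllib.request
import feedparser

url = "https://news.google.com/news?q=fashion&output=rss"
# Fetch the raw feed first, so a network or SSL failure raises here
# instead of being folded into feedparser's bozo flag.
response = urllib.request.urlopen(url).read()

print(response)

d = feedparser.parse(response)
print(len(d.entries))
for item in d.entries:
    print("------")
    print(item.title)
    # Optional fields are checked before use; not every entry has them.
    if 'subtitle' in item:
        print(item.subtitle)
    print(item.link)
    print(item.description)
    print(item.published)
    print(item.id)
    print(item.updated)
    if 'content' in item:
        print(item.content)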

Or, paste the FULL code you're running, and I'll take a look.
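
If the entry count comes back as 0, the `bozo` and `bozo_exception` keys in the parsed result (both visible in the `print(feed)` output in the question) usually say why. Here is a minimal sketch for checking them before looping over entries. `describe_parse_result` is a hypothetical helper name, and the sketch only assumes the result supports dict-style `.get()`, as the printed output suggests.

```python
def describe_parse_result(feed):
    """Summarise a parsed feed: entry count plus any reported problem."""
    entries = feed.get("entries", [])
    # bozo is truthy when the feed could not be fetched or was malformed.
    if feed.get("bozo"):
        error = feed.get("bozo_exception")
        return "{} entries; problem reported: {!r}".format(len(entries), error)
    return "{} entries; no problem reported".format(len(entries))


# Usage (needs feedparser and network access):
#   import feedparser
#   feed = feedparser.parse("https://news.google.com/news?q=fashion&output=rss")
#   print(describe_parse_result(feed))
```

With the result shown in the question, this would report zero entries together with the SSL certificate error, which points at the connection rather than at the parsing loop.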

Gedeon Mutshipayi
AS Mackay
  • @kevinabraham I think that means you're simply unable to read data from the feed; it's not a Python coding issue. Do you see data if you open that URL directly in a web browser? – AS Mackay Jul 10 '17 at 10:54
  • Yes, I can. When I open the link directly it shows the text `NFE/1.0fashion - Google News`...... – kevinabraham Jul 10 '17 at 10:57

I figured out that the issue was actually with the SSL handshake. I fixed it by adding ssl._create_default_https_context = ssl._create_unverified_context.

For anyone else facing this issue, the full code is:

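import feedparser
import ssl

# Workaround: make HTTPS requests skip certificate verification.
# This turns off protection against man-in-the-middle attacks.
if hasattr(ssl, '_create_unverified_context'):
    ssl._create_default_https_context = ssl._create_unverified_context
rss = 'https://news.google.com/news?q=fashion&output=rss'
feed = feedparser.parse(rss)

print(feed)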
kevinabraham
  • Doing so will allow anyone with a privileged network position to trivially execute a man-in-the-middle attack against a Python application using either of these HTTP clients, and change traffic at will. See [Enabling certificate verification by default for stdlib http clients](https://www.python.org/dev/peps/pep-0476/) – stovfl Jul 10 '17 at 19:18
  • @stovfl oh right. Would it be better to use `create_default_context` instead of `_create_unverified_context`? – kevinabraham Jul 11 '17 at 10:40
  • Neither the first nor the second. I would install certificates using `pip install urllib3[secure]`, and read [Section: Certificate verification](http://urllib3.readthedocs.io/en/latest/user-guide.html?highlight=certificate) – stovfl Jul 11 '17 at 13:17
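
Following the comments above, here is a rough sketch of keeping certificate verification switched on instead of disabling it. It downloads the feed through a verifying SSL context from the standard library and hands the raw bytes to feedparser. `make_verified_context` and `fetch_feed` are hypothetical helper names, not part of feedparser. The sketch assumes the failure comes from Python not finding a usable CA bundle, so it optionally accepts a bundle path such as the one returned by `certifi.where()`, if the third-party certifi package is installed.

```python
import ssl
import urllib.request


def make_verified_context(cafile=None):
    """Return an SSL context that checks certificates and hostnames."""
    # cafile=None falls back to the system trust store; pass a CA bundle
    # path (an assumption: e.g. certifi.where()) when that store is empty.
    return ssl.create_default_context(cafile=cafile)


def fetch_feed(url, context=None, timeout=10):
    """Download a feed over verified HTTPS and parse the raw bytes."""
    import feedparser  # third-party; imported here so the helper above stays stdlib-only

    if context is None:
        context = make_verified_context()
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        data = response.read()
    return feedparser.parse(data)


# Usage (needs feedparser and network access):
#   feed = fetch_feed("https://news.google.com/news?q=fashion&output=rss")
#   print(len(feed.entries))
```

If verification still fails with this context, `urlopen` raises the SSL error directly, which is easier to spot than an empty `entries` list.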