I've looked through the Python feedparser documentation and done a fair amount of Googling, but I can't find any example feeds that look like the one I'm working with:

http://smrss.neulion.com/u/nhl/mrss/sights-and-sounds/vod.xml

I'm trying to access the mp4 URLs in the media:group -> media:content element of each item in the feed.

Here's my code so far:

#! /usr/bin/python
# -*- coding: utf-8 -*-

import feedparser

d = feedparser.parse('http://smrss.neulion.com/u/nhl/mrss/sights-and-sounds/vod.xml')

for index,item in enumerate(d.entries):
    if index >= 4:
        print item.title
        print item.media_content
        print item.summary

What prints out to Terminal for item.media_content is:

[{'duration': u'150', 'url': u'http://smrss.neulion.com/spmrss/s/nhl/vod/flv/2015/04/19/811204_20150418_PIT_NYR_WIRELESS_1800_sd.mp4', 'type': u'video_sd.mp4'}]

This is a dictionary inside of a list, yes? What would be the best way to iterate through this dictionary in my for loop so I can extract the value at the 'url' key?

AdjunctProfessorFalcon

2 Answers

If item.media_content is always a list containing a single dictionary, index into the list first and then iterate over that dictionary:

for key, val in item.media_content[0].iteritems():
    print key, val
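
If only the value at the 'url' key is needed, there is no need to loop over every key. Here is a minimal Python 3 sketch of that idea, assuming media_content is a list of plain dictionaries like the one printed in the question. The helper name extract_urls is hypothetical, and the sample value is copied from the question's output rather than fetched from the feed:

```python
def extract_urls(media_content):
    """Return the 'url' value of every dictionary in a media_content list.

    extract_urls is a hypothetical helper name, not part of feedparser.
    Dictionaries without a 'url' key are skipped.
    """
    return [media["url"] for media in media_content if "url" in media]


# Sample value copied from the output shown in the question.
sample = [{
    "duration": "150",
    "url": "http://smrss.neulion.com/spmrss/s/nhl/vod/flv/2015/04/19/"
           "811204_20150418_PIT_NYR_WIRELESS_1800_sd.mp4",
    "type": "video_sd.mp4",
}]

for url in extract_urls(sample):
    print(url)
```

Inside the loop from the question this would be called as extract_urls(item.media_content). Returning a list rather than indexing [0] directly means the same code still works if an item ever carries more than one media:content element.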
Julien Spronck
  • Thank you very much for breaking that down! I was missing the [0] — I'm assuming we need to tell Python which index the list is at, even if there's only one media_content list per item in the feed? – AdjunctProfessorFalcon May 01 '15 at 19:14
  • You're welcome :-) Indeed, you first tell Python to get the first item of the list. – Julien Spronck May 01 '15 at 19:42

I'd recommend using BeautifulSoup:

import urllib
from bs4 import BeautifulSoup
url = "http://smrss.neulion.com/u/nhl/mrss/sights-and-sounds/vod.xml"
vod = urllib.urlopen(url)
# Parse the downloaded feed so that soup can be queried
soup = BeautifulSoup(vod)

In [1752]: [i['url'] for i in soup.findAll('media:content') if i.has_attr('url')]
Out[1752]: 
['http://smrss.neulion.com/spmrss/s/nhl/vod/flv/2015/04/30/817293_C150008B_20150428_ROUND_ONE_WIRELESS_RECAP_SHORT_5000_sd.mp4',
 'http://smrss.neulion.com/spmrss/s/nhl/vod/flv/2015/04/28/816995_20150427_NHL_Playoff_Access_NYI_WSH_GM7_5000_sd.mp4',
 'http://smrss.neulion.com/spmrss/s/nhl/vod/flv/2015/04/26/816230_20150426_WIRELESS_RECAP_5000_sd.mp4',
 'http://smrss.neulion.com/spmrss/s/nhl/vod/flv/2015/04/25/815823_20150425_WIRELESS_GM5_OTT_5000_sd.mp4',
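
For anyone who would rather avoid a third-party dependency, the same "find every media:content element and read its url attribute" idea can be sketched in Python 3 with the standard library's xml.etree.ElementTree in place of BeautifulSoup. This is only a sketch under a few assumptions: the inline SAMPLE_FEED is a made-up stand-in for the real feed (the example.com URL is not from the source), media_urls is a hypothetical helper name, and the feed is assumed to declare the usual Media RSS namespace, http://search.yahoo.com/mrss/:

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI mapping for the Media RSS namespace (assumed to be the one
# the feed declares for its "media:" prefix).
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}

# Made-up stand-in for the real feed, so the sketch runs without a network call.
SAMPLE_FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Example clip</title>
      <media:group>
        <media:content url="http://example.com/clip_sd.mp4"
                       type="video_sd.mp4" duration="150"/>
      </media:group>
    </item>
  </channel>
</rss>"""


def media_urls(xml_text):
    """Return the url attribute of every media:content element that has one."""
    root = ET.fromstring(xml_text)
    return [
        element.get("url")
        for element in root.iterfind(".//media:content", MEDIA_NS)
        if element.get("url")
    ]


print(media_urls(SAMPLE_FEED))
```

To run this against the live feed, the XML text would first have to be downloaded (for example with urllib.request.urlopen in Python 3) and passed to media_urls in place of SAMPLE_FEED.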
fixxxer