Questions tagged [feedparser]

A Python library that parses feeds in all known formats, including Atom, RSS, and RDF.

Universal feed parser, handles RSS 0.9x, RSS 1.0, RSS 2.0, CDF, Atom 0.3, and Atom 1.0 feeds.

362 questions
2
votes
1 answer

Checking for updated RSS feeds with Feedzirra

I am using Feedzirra to parse my RSS feeds and it works very well -- it is twice as fast as Feed Normalizer in my initial testing. More importantly, it has nice wrappers that check for updated entries inside a feed. When I was using its feed-update…
Ecognium
  • 2,046
  • 1
  • 19
  • 35
2
votes
1 answer

Gmail feed retrieve email text

Is it possible to retrieve complete email messages, not just the summary, from the Gmail feed? Is there a way to change the feed? I tried parsing a feed and only got this: Gmail - Inbox for…
Vlad Otrocol
  • 2,952
  • 7
  • 33
  • 55
2
votes
2 answers

Undesired python feedparser instantiation relic

Question: How do I kill an instantiation or ensure I'm creating a new instantiation of the python universal feedparser? Info: I'm working on a program right now that downloads and catalogs large numbers of blogs. It has worked well so far except…
Narcolapser
  • 5,895
  • 15
  • 46
  • 56
2
votes
2 answers

Ruby, why FeedNormalizer usage breaks Classifier::CRM114

Just learning Ruby and found something bizarre (at least for an ANSI C programmer). Having Mac OS X 10.6.2, ruby 1.8.7 (2008-08-11 patchlevel 72) [universal-darwin10.0], feed-normalizer 1.5.1 and crm114 1.0.3 require 'rubygems' require 'crm114' require…
Katve
  • 328
  • 2
  • 9
2
votes
2 answers

Parse Facebook feed datetime in python?

I am reading a Facebook updates feed using the python library 'feedparser'. I loop through the collection of entries in my Django templates, and display the results. The updated field is returned in a big long string, of some format I am unfamiliar…
rmontgomery429
  • 14,660
  • 17
  • 61
  • 66
2
votes
1 answer

Parsing RSS in Python

I am trying to parse an RSS feed using Python. The RSS feed has the format: <rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0"> <channel> <title>Yahoo! News - Latest News & Headlines
Manas Paldhe
  • 766
  • 1
  • 10
  • 32
2
votes
1 answer

Parsing out all timestamps in RSS feed using feedparser

I am fairly new to the feedparser lib in Python. Trying to parse out a complete list of timestamps from an RSS feed, I currently have: import feedparser from time import gmtime, strftime d =…
Jacob Irwin
  • 153
  • 1
  • 13
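For the timestamp question above, a minimal sketch of one possible approach, not the asker's code. It relies on feedparser exposing parsed dates as UTC `time.struct_time` values (for example under `published_parsed`). The names `struct_time_to_datetime`, `collect_timestamps`, and `sample_entries` are hypothetical, and plain dicts stand in for real feed entries so the snippet runs without a network connection or the library itself.

```python
import calendar
import time
from datetime import datetime, timezone


def struct_time_to_datetime(parsed):
    """Convert a UTC time.struct_time into a timezone-aware datetime."""
    # calendar.timegm interprets the tuple as UTC (time.mktime would use local time).
    return datetime.fromtimestamp(calendar.timegm(parsed), tz=timezone.utc)


def collect_timestamps(entries):
    """Return one datetime per entry that carries a parsed publication date.

    `entries` is assumed to be a list of dict-like objects shaped like
    feedparser entries, each with an optional "published_parsed" key.
    """
    stamps = []
    for entry in entries:
        parsed = entry.get("published_parsed")
        if parsed is not None:  # some feeds omit dates or carry unparseable ones
            stamps.append(struct_time_to_datetime(parsed))
    return stamps


# Hypothetical stand-in for the entries of a parsed feed.
sample_entries = [
    {"title": "first", "published_parsed": time.gmtime(0)},
    {"title": "no date"},
    {"title": "second", "published_parsed": time.gmtime(86400)},
]

for stamp in collect_timestamps(sample_entries):
    print(stamp.strftime("%Y-%m-%d %H:%M:%S"))
```

With a real feed, the list passed in would presumably be the `entries` attribute of the object returned by `feedparser.parse`; entries without a date are skipped rather than raising.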
2
votes
2 answers

IRC bot: make a loop sleep without interrupting the main loop

I have been trying to code an IRC bot, and I have succeeded, but I am having problems implementing something I want to do. The code works fine, but I have issues with the following: since the bot uses a while loop to read commands from the IRC when I add a…
Slightz
  • 67
  • 9
1
vote
1 answer

Django and Feedparser - Cannot parse URLs queried from model

Basically, I am saving a few feed URLs in a Django model, but the URLs that I retrieve from the model are not parsed. Below is how I am trying to query the model and parse a URL using feedparser. >>> from bit.models import * >>> url =…
Anshuma
  • 3,242
  • 2
  • 16
  • 14
1
vote
1 answer

Tracking when several callbacks are complete in node.js with Mongoose and FeedParser

I'm writing a batch process to read an RSS feed, and store the contents in MongoDB via Mongoose. I would run the script, and it would process the contents just fine... but the script wouldn't return to the console. My hypothesis was that my database…
adamb0mb
  • 1,401
  • 1
  • 12
  • 15
1
vote
1 answer

Force feedparser to sanitize on all content types

For a project, I want to use feedparser. Basically, I got it working. The documentation section about sanitization says that not all content types are sanitized. How can I force feedparser to do this on all content types?
Martin
  • 4,170
  • 6
  • 30
  • 47
1
vote
4 answers

RSS/Python - Parsing Single Image URL

I'm in the process of learning to parse XML and RSS feeds correctly and have run into a little problem. I'm using feedparser in Python to parse a specific entry from an RSS feed, but can't figure out how to parse just a single img src from the…
user1130601
  • 237
  • 1
  • 4
  • 15
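For the single-image question above, a minimal sketch that pulls the first `img` `src` out of an HTML fragment using only the standard library's `html.parser`. It assumes the entry's HTML body is already available as a string (for instance an entry summary); the names `FirstImageFinder` and `first_image_src` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class FirstImageFinder(HTMLParser):
    """Remember the src of the first <img> tag that has one."""

    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; self-closing <img/> tags
        # are also routed through this method by the base class.
        if tag == "img" and self.src is None:
            self.src = dict(attrs).get("src")


def first_image_src(html):
    """Return the first image URL found in `html`, or None if there is none."""
    finder = FirstImageFinder()
    finder.feed(html)
    finder.close()
    return finder.src


# Hypothetical entry body, standing in for an entry's HTML summary.
sample_html = '<p>Hi <img src="http://example.com/a.png" alt="a"/> <img src="b.png"></p>'
print(first_image_src(sample_html))
```

Using an HTML parser instead of a regular expression keeps the lookup tolerant of attribute order and quoting, at the cost of a few more lines.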
1
vote
0 answers

Validate an RSS feed in a Java program which uses org.apache.commons.feedparser for parsing feeds

In our application we are using the Apache Commons FeedParser API, which works well; however, I intend to validate a feed URL before parsing it. Below are the three kinds of exceptions that we are using in case any exception occurs during…
vaibhav
  • 3,929
  • 8
  • 45
  • 81
1
vote
1 answer

XML::RSS::Parser and Facebook RSS feed ...

I need a subroutine which should parse "any" RSS feed passed to it. I have used XML::RSS::Parser a few times already for some RSS feeds, but it does not work with Facebook. Example code: use LWP::Simple; use XML::RSS::Parser; my $url = join '',…
1
vote
1 answer

To parse with the help of atom parsing

I have to parse this type of XML: blah..blah...blah blah..blah...blah How can I parse it with Atom parsing (root…
zaiff
  • 999
  • 2
  • 13
  • 29