Extracting text values from atom feed with Ruby RSS

Question

I'm trying to use Ruby's standard-library RSS::Parser to parse an Atom feed, and it only partly works.

When I access an extracted field such as .title, it returns <title>The title</title> rather than just The title. When parsing an RSS feed, by contrast, .channel.title returns The title.

Is there any way to use the standard RSS::Parser for Atom feeds, or is this a bug?

I know there are alternatives like Feedzirra, but I would rather use the standard lib.

A quick test that shows the problem in Ruby 1.9.3 and 2.0:

require "rss"
feed = RSS::Parser.parse(open("http://casadelkrogh.dk/atom.xml").read)
feed.title.to_s #=> "<title>CasaDelKrogh</title>"

score 3 · Accepted Answer · answered Oct 04 '13 at 18:32

To get the text of the title, call content on it instead of to_s:

require "rss"
feed = RSS::Parser.parse(open("http://casadelkrogh.dk/atom.xml").read)
feed.title.to_s
# => "<title>CasaDelKrogh</title>"
feed.title.content
# => "CasaDelKrogh"
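The same accessor can be tried without network access. The following is a minimal sketch, assuming a small hand-written Atom document (the SAMPLE_ATOM constant and everything in it are made up for illustration) and with validation switched off so that such a minimal feed parses. It also assumes that the entry titles expose content in the same way as the feed title.

```ruby
require "rss"

# Hypothetical sample feed, written by hand for this example.
SAMPLE_ATOM = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom">
    <title>Example Feed</title>
    <id>urn:example:feed</id>
    <updated>2013-10-04T18:30:00Z</updated>
    <author><name>Jane Doe</name></author>
    <entry>
      <title>First entry</title>
      <id>urn:example:entry-1</id>
      <updated>2013-10-04T18:30:00Z</updated>
    </entry>
    <entry>
      <title>Second entry</title>
      <id>urn:example:entry-2</id>
      <updated>2013-10-04T18:31:00Z</updated>
    </entry>
  </feed>
XML

# Second argument (do_validate) is false: skip strict validation.
feed = RSS::Parser.parse(SAMPLE_ATOM, false)

# to_s gives the element as XML markup; content gives only the text.
puts feed.title.to_s
puts feed.title.content # => "Example Feed"

# Each entry title is an element object too, so content is needed again.
entry_titles = feed.entries.map { |entry| entry.title.content }
puts entry_titles.inspect
```

Replacing SAMPLE_ATOM with the body of a real feed should behave the same way, although real feeds may contain elements this sketch does not exercise.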

answered Oct 04 '13 at 18:32

Arup Rakshit

score 2 · Answer 2 · edited Oct 04 '13 at 18:41

It's not a bug.

The to_s method is close to an inspection of the RSS::Atom::Feed::Title object: it returns the element including its tag.

Use feed.title.content if you want the title without the tag.
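Because an RSS feed returns the title as a plain string from channel.title while an Atom feed returns an element object, code that handles both formats needs to branch. A rough sketch follows, assuming a hypothetical helper named feed_title and two hand-written sample documents; neither the helper nor the samples come from the rss library or from the feed in the question.

```ruby
require "rss"

# Hypothetical helper: returns the feed title as plain text for
# either an Atom feed or an RSS 2.0 feed, or nil for anything else.
def feed_title(feed)
  case feed
  when RSS::Atom::Feed
    feed.title.content   # Atom: title is an element, unwrap it
  when RSS::Rss
    feed.channel.title   # RSS: title is already a string
  end
end

# Made-up sample documents for illustration only.
atom_xml = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom">
    <title>Atom Example</title>
    <id>urn:example:feed</id>
    <updated>2013-10-04T18:30:00Z</updated>
    <author><name>Jane Doe</name></author>
  </feed>
XML

rss_xml = <<~XML
  <rss version="2.0">
    <channel>
      <title>RSS Example</title>
      <link>http://example.com/</link>
      <description>A sample channel</description>
    </channel>
  </rss>
XML

# Validation is disabled (second argument) to keep the samples short.
atom_feed = RSS::Parser.parse(atom_xml, false)
rss_feed  = RSS::Parser.parse(rss_xml, false)

puts feed_title(atom_feed) # => "Atom Example"
puts feed_title(rss_feed)  # => "RSS Example"
```

Branching on the parsed class keeps the caller free of format checks; RSS 1.0 (RDF) feeds are not covered by this sketch.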

edited Oct 04 '13 at 18:41

Arup Rakshit

answered Oct 04 '13 at 18:29

humbroll
