
I'm trying to use Ruby's standard library RSS::Parser to parse an Atom feed, which sort of works.

When I access an extracted field such as .title, it returns <title>The title</title> rather than just The title. If you parse an RSS feed instead, .channel.title returns The title.

Is there a way to get the plain text out of an Atom feed with the standard RSS::Parser, or is this a bug?

I know there are alternatives like Feedzirra, but I would rather use the standard library.

A quick test that shows the problem in Ruby 1.9.3 and 2.0:

require "rss"
feed = RSS::Parser.parse(open("http://casadelkrogh.dk/atom.xml").read)
feed.title.to_s #=> "<title>CasaDelKrogh</title>"
Dan Lowe

2 Answers

3

To get the text of the title, call content on it, as below:

require "rss"
feed = RSS::Parser.parse(open("http://casadelkrogh.dk/atom.xml").read)
feed.title.to_s
# => "<title>CasaDelKrogh</title>"
feed.title.content
# => "CasaDelKrogh"
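
If the same code has to cope with both feed types, a small helper can hide the difference. This is only a sketch: feed_title and the two inline feeds are made up for illustration, and it assumes the rss library is available (on Ruby 3.0 and later it ships as a bundled gem rather than a default one).

```ruby
require "rss"

# Hypothetical helper: returns the feed title as a plain string
# for both Atom and RSS 2.0 feeds.
def feed_title(feed)
  case feed
  when RSS::Atom::Feed
    feed.title.content   # Atom: title is an element object
  when RSS::Rss
    feed.channel.title   # RSS 2.0: title is already a String
  else
    raise ArgumentError, "unsupported feed type: #{feed.class}"
  end
end

# Minimal inline feeds (made-up data) so the sketch runs without network access.
ATOM_XML = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom">
    <id>urn:example:atom-feed</id>
    <title>Example Atom Feed</title>
    <updated>2013-01-01T00:00:00Z</updated>
    <author><name>Example Author</name></author>
  </feed>
XML

RSS_XML = <<~XML
  <rss version="2.0">
    <channel>
      <title>Example RSS Feed</title>
      <link>http://example.com/</link>
      <description>An example channel</description>
    </channel>
  </rss>
XML

puts feed_title(RSS::Parser.parse(ATOM_XML))
puts feed_title(RSS::Parser.parse(RSS_XML))
```

RSS 1.0 feeds parse to a different class (RSS::RDF) and are not handled here; they would hit the ArgumentError branch.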
Arup Rakshit
2

It's not a bug.

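Calling to_s on an RSS::Atom::Feed::Title returns the element's XML representation, tags included, so it is closer to an inspection of the object than to its text.

Use feed.title.content if you want the title without the tags.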
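
The difference can be checked without fetching anything. The snippet below is a minimal sketch that parses a small made-up Atom document (the ids, titles and author are placeholders, not taken from the feed in the question) and assumes the rss library is installed.

```ruby
require "rss"

# Made-up Atom feed, kept just complete enough to pass the parser's validation.
xml = <<~XML
  <feed xmlns="http://www.w3.org/2005/Atom">
    <id>urn:example:feed</id>
    <title>Example Feed</title>
    <updated>2013-01-01T00:00:00Z</updated>
    <author><name>Example Author</name></author>
    <entry>
      <id>urn:example:entry-1</id>
      <title>First entry</title>
      <updated>2013-01-01T00:00:00Z</updated>
    </entry>
  </feed>
XML

feed = RSS::Parser.parse(xml)

# to_s serializes the element, so the output still carries the <title> tags.
puts feed.title.to_s

# content returns only the text inside the element.
puts feed.title.content

# Entry titles are element objects too, so content applies to them as well.
feed.entries.each do |entry|
  puts entry.title.content
end
```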

Arup Rakshit
humbroll