Given this RSS content, which features the non-standard namespace 'torznab':

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:torznab="http://torznab.com/schemas/2015/feed">
  <channel>
    <title>AggregateSearch</title>
    <description>This feed includes all configured trackers</description>
    <link>http://127.0.0.1/</link>
    <language>en-US</language>
    <category>search</category>
    <item>
      <title>Yellowjackets.S02E08.It.Chooses.720p.AMZN.WEBRip.DDP5.1.x264-NTb[rartv]</title>
      <guid>https://rarbg.to/infohash/4ce37b2cd4157349a6eb2e6100513e8946fbdcac</guid>
      <jackettindexer id="rarbg">RARBG</jackettindexer>
      <type>public</type>
      <comments>https://torrentapi.org/redirect_to_info.php?token=0186pk4m73</comments>
      <pubDate>Fri, 19 May 2023 23:21:30 +1000</pubDate>
      <size>920749041</size>
      <description />
      <link>magnet:?xt=urn:btih:4ce37b2cd4157349a6eb2e6100513e8946fbdcac</link>
      <category>5040</category>
      <category>100041</category>
      <enclosure url="magnet:?xt=urn:btih:4ce37b2cd4157349a6eb2e6100513e8946fbdcac" length="920749041" type="application/x-bittorrent" />
      <torznab:attr name="seeders" value="279" />
      <torznab:attr name="peers" value="353" />
    </item>
  </channel>
</rss>

If I want to parse it using XML::RSS and get access to those <torznab:attr /> elements, I should be able to do it by adding that namespace, as in this minimal script:

use strict;
use warnings;
use feature 'say';    # say() is not available unless this feature is enabled
use XML::RSS;
use Data::Dumper;

my $rss = XML::RSS->new();
# add the special namespace before parsing
$rss->add_module(
    prefix => 'torznab',
    uri    => "http://torznab.com/schemas/2015/feed"
);
$rss->parsefile("torznab.xml");
say Dumper($rss);
say Dumper( $rss->{items}->[0] );

But when I do that, the elements just aren't there in the output.

What am I missing? TIA.

AmbroseChapel
  • 1
    Some poking around in the XML::RSS source indicates it just ignores those namespaced attr elements when parsing the document. They're skipped and not added to the rss object's data at all. – Shawn May 20 '23 at 07:09
  • 1
    Might want to file a bug report. – Shawn May 20 '23 at 07:11
  • 1
    For the record I solved my problem by switching to XML::RSSLite, https://metacpan.org/pod/XML::RSSLite which is a "relaxed" parser. It's not a pure XML parser, it's "less concerned with XML compliance" than the other modules but it's working for me. – AmbroseChapel May 21 '23 at 01:30
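
Since XML::RSS appears to skip these elements, another possible workaround is to read them with a general-purpose XML parser. The following is only a minimal sketch, assuming XML::LibXML is installed (that module is not mentioned anywhere in the thread). The helper name `torznab_attrs` is hypothetical, and the feed is a trimmed copy of the one in the question, inlined so the script is self-contained.

```perl
use strict;
use warnings;
use feature 'say';
use XML::LibXML;    # assumption: not part of the original thread

# Trimmed copy of the feed from the question, inlined for illustration.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:torznab="http://torznab.com/schemas/2015/feed">
  <channel>
    <item>
      <title>Yellowjackets.S02E08.It.Chooses.720p.AMZN.WEBRip.DDP5.1.x264-NTb[rartv]</title>
      <torznab:attr name="seeders" value="279" />
      <torznab:attr name="peers" value="353" />
    </item>
  </channel>
</rss>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# XPath needs the namespace registered under a prefix before it can match
# the torznab:attr elements.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( torznab => 'http://torznab.com/schemas/2015/feed' );

# Hypothetical helper: collect the name/value pairs of an item's
# <torznab:attr /> children into a hash reference.
sub torznab_attrs {
    my ($item) = @_;
    my %attrs;
    for my $node ( $xpc->findnodes( './torznab:attr', $item ) ) {
        $attrs{ $node->getAttribute('name') } = $node->getAttribute('value');
    }
    return \%attrs;
}

for my $item ( $xpc->findnodes('/rss/channel/item') ) {
    my $attrs = torznab_attrs($item);
    say $xpc->findvalue( './title', $item );
    say "  $_ = $attrs->{$_}" for sort keys %$attrs;
}
```

To run this against the real feed rather than the inlined string, `load_xml( location => "torznab.xml" )` should work in place of the `string` argument.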

0 Answers