0

I have an XML file that looks like this:

<booklist>
   <book type="technical">
      <author>Book 1 author 1</author>
      <author>Book 1 author 2</author>
      <title>Book 1 title</title>
      <isbn>Book1ISBN</isbn>
   </book>
   <book type="fiction">
      <author>Book 2 author 1</author>
      <author>Book 2 author 2</author>
      <title>Book 2 title</title>
      <isbn>Book2ISBN</isbn>
   </book>
   <book type="technical">
      <author>Book 3 author 1</author>
      <author>Book 3 author 2</author>
      <author>Book 3 author 3</author>
      <title>Book 3 title</title>
      <isbn>Book3ISBN</isbn>
   </book>
</booklist>

I load it with XML::Simple's XMLin and key the books by their type attribute. I thought that organizing each book by its type would be a good way to do it.

/tmp/walt $ cat bookparse_by_attrib.pl_dump
#!/usr/bin/perl
use strict ;
use warnings ;
use XML::Simple ;
use Data::Dumper ;
my $book = ();

my $booklist = XMLin('book.xml_with_attrib', KeyAttr => {book => 'type'});
#print Dumper($booklist);
print $booklist->{book}->{technical}->{title}  . "\n";


/tmp/walt $ ./bookparse_by_attrib.pl_dump
$VAR1 = {
          'book' => {
                    'technical' => {
                                   'author' => [
                                               'Book 3 author 1',
                                               'Book 3 author 2',
                                               'Book 3 author 3'
                                             ],
                                   'title' => 'Book 3 title',
                                   'isbn' => 'Book3ISBN'
                                 },
                    'fiction' => {
                                 'author' => [
                                             'Book 2 author 1',
                                             'Book 2 author 2'
                                           ],
                                 'title' => 'Book 2 title',
                                 'isbn' => 'Book2ISBN'
                               }
                  }
        };

This prints:

print $booklist->{book}->{technical}->{title}  . "\n";
/tmp/walt $ ./bookparse_by_attrib.pl_dump
Book 3 title

So it works when I know the type name. However, this produces a warning:

print $booklist->{book}->{type}->{title}  . "\n";
Use of uninitialized value in concatenation (.) or string at ./bookparse_by_attrib.pl_dump line 11.

This does not produce an error, but it does not print anything either:

#!/usr/bin/perl
use strict ;
use warnings ;
use XML::Simple ;
use Data::Dumper ;
my $book = ();
my $booklist = ();

foreach my $book (@{$booklist->{book}}) {
        print $book->{title} . "\n";
        }

I am trying to print out the types, and it only works if I already know them. Ultimately, I want to print each type along with the titles of its books, but for now, if I could just print out the types, that would be great.

capser

2 Answers

2

I'm going to repeat what I advised in my answer to your earlier question: dereferencing a XML::Simple hash

Do not use XML::Simple. It is an outdated module that will only lead to continued problems as you attempt to hack it to give the format that you need.

Instead, use XML::LibXML to directly pull the information that it sounds like you want:

use strict;
use warnings;

use List::MoreUtils qw(uniq);
use XML::LibXML;

my $xml = XML::LibXML->load_xml(IO => \*DATA);

my @types = sort +uniq map {$_->textContent()} $xml->findnodes('//book/@type');

for my $type (@types) {
    print "Type = $type\n";

    for my $book ($xml->findnodes("//book[\@type='$type']")) {
        print "  Title = " . $book->findvalue('title') . "\n";
    }
}

__DATA__
<booklist>
   <book type="technical">
      <author>Book 1 author 1</author>
      <title>Book 1 title</title>
      <isbn>Book1ISBN</isbn>
   </book>
   <book type="fiction">
      <author>Book 2 author 1</author>
      <author>Book 2 author 2</author>
      <title>Book 2 title</title>
      <isbn>Book2ISBN</isbn>
   </book>
   <book type="technical">
      <author>Book 3 author 1</author>
      <author>Book 3 author 2</author>
      <author>Book 3 author 3</author>
      <title>Book 3 title</title>
      <isbn>Book3ISBN</isbn>
   </book>
</booklist>

Outputs:

Type = fiction
  Title = Book 2 title
Type = technical
  Title = Book 1 title
  Title = Book 3 title
Miller
  • It is good to know that XML::Simple is outdated. However, it is the module used at this shop, and installing another module involves submissions to multiple departments, testing by QA, buying lunch for a sysadmin, and then persuading my line manager to approve it. I will work on getting XML::LibXML, but for now I have to work with XML::Simple, bugs and all. – capser Jul 28 '14 at 12:33
  • You downvoted my question. The question was about XML::Simple, not XML::LibXML. I really don't care about the downvote. In the future I think I will state in the preamble that suggesting another Perl module won't solve the particular query. – capser Jul 28 '14 at 12:44
  • I do not downvote a question unless it does not show proper effort to solve and/or explain the problem. Someone else might've downvoted because of posting such a similar question within 2 hours, but I honestly can't deduce why some people up/down vote and therefore mostly don't worry about it. – Miller Jul 28 '14 at 16:24
  • 1
    I want to thank you. I went to the sysadmin to ask him to install XML::LibXML, and he said, "Yeah, XML::Simple really does not work, I have been meaning to replace it with XML::LibXML for a while." It is normally a long process, replete with paperwork, cost-benefit analyses, and numerous approvals, but he completed it right away without hesitation. Thank you again. – capser Jul 29 '14 at 18:38
  • I am so happy for you. That's the best news I'll likely hear all week on these forums. I hope you enjoy your increased productivity and reduced headaches. ;) – Miller Jul 29 '14 at 21:13
1

The structure of the key "book" is a hash reference, however you're treating it as an array reference (@{$booklist->{book}}).
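If it helps, here is a minimal sketch of how you might check which kind of reference you are holding before dereferencing it. The `$booklist` literal is only a trimmed, hand-written stand-in for what XMLin returned in the question, and the `$books` and `$kind` variable names are my own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hand-written stand-in for the structure XMLin produced in the question.
my $booklist = {
    book => {
        technical => { title => 'Book 3 title' },
        fiction   => { title => 'Book 2 title' },
    },
};

my $books = $booklist->{book};
my $kind  = ref $books;    # 'HASH' here; 'ARRAY' if it were a list of books

if ( $kind eq 'HASH' ) {
    # The keys are the values of the "type" attribute, so this lists the types.
    print "$_\n" for sort keys %$books;
}
elsif ( $kind eq 'ARRAY' ) {
    # Only reached if the books were kept as a list instead.
    print "$_->{title}\n" for @$books;
}
```

With the stand-in data above this prints `fiction` and then `technical`, which is the list of types the question asks for.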

A general problem you're going to run into with the way this data is structured is that it's 100% hashes. Once you have two books of the same type, you'll only get the last book listed for each type.

#!/usr/bin/perl
use warnings;
use strict;

my $booklist = {
    'book' => {
        'technical' => {
            'author' => [
                'Book 3 author 1',
                'Book 3 author 2',
                'Book 3 author 3'
            ],
            'title' => 'Book 3 title',
            'isbn' => 'Book3ISBN'
        },
        'fiction' => {
            'author' => [
                'Book 2 author 1',
                'Book 2 author 2'
            ],
            'title' => 'Book 2 title',
            'isbn' => 'Book2ISBN'
        }
    }
};

for my $book_type ( keys %{ $booklist->{book} } ) {
    printf( "Title: %s\n", $booklist->{book}->{$book_type}->{title} );
}
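
If you have to stay with XML::Simple, one possible way around the lost books is to turn off key folding so that every `<book>` stays a separate array element. The sketch below assumes the documented `KeyAttr` and `ForceArray` options of XMLin; the inline XML is a shortened copy of the question's file, and `%titles_by_type` is a name I made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Shortened copy of the XML from the question (authors and ISBNs left out).
my $xml = <<'XML';
<booklist>
   <book type="technical"><title>Book 1 title</title></book>
   <book type="fiction"><title>Book 2 title</title></book>
   <book type="technical"><title>Book 3 title</title></book>
</booklist>
XML

# KeyAttr => [] disables folding on an attribute; ForceArray keeps <book>
# as an array reference even if the file contains only one book.
my $booklist = XMLin( $xml, KeyAttr => [], ForceArray => ['book'] );

# Group the titles by the value of each book's "type" attribute.
my %titles_by_type;
for my $book ( @{ $booklist->{book} } ) {
    push @{ $titles_by_type{ $book->{type} } }, $book->{title};
}

for my $type ( sort keys %titles_by_type ) {
    print "Type = $type\n";
    print "  Title = $_\n" for @{ $titles_by_type{$type} };
}
```

Because each book is now an element of an array, both technical books survive, and the type is available as `$book->{type}` rather than as a hash key.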
SymKat