I'm trying to generate an array of arrays from an XML string to represent a nested tree structure.

But when I build the reference $output, some elements inside B1 are missing (C1, D1, D2). $test is generated using the XML::LibXML::Reader CPAN module.

use strict;
use warnings;
use Data::Dumper;
use v5.10;

my $test = "start, /root/class1, A1
start, /root/class1/class2, B1
start, /root/class1/class2/class3, C1
start, /root/class1/class2/class3/class4, D1
end, /root/class1/class2/class3/class4, D1
start, /root/class1/class2/class3/class4, D2
end, /root/class1/class2/class3/class4, D2
end, /root/class1/class2/class3, C1
end, /root/class1/class2, B1
start, /root/class1/class2, B2
start, /root/class1/class2/class3, C2
start, /root/class1/class2/class3/class4, D1
end, /root/class1/class2/class3/class4, D1
start, /root/class1/class2/class3/class4, D2
end, /root/class1/class2/class3/class4, D2
start, /root/class1/class2/class3/class4, D3
end, /root/class1/class2/class3/class4, D3
end, /root/class1/class2/class3, C2
end, /root/class1/class2, B2
end, /root/class1, A1";

our $x = 0;

my $output = generator($test); 

say "Output: ". Dumper $output;

sub generator{
    my ($classes, $x, $subout) = (shift, shift, '');
    my @out;

    $x += 1;

    while($classes =~ /(start(.+?class$x\,\ (\w+))\n(.*?)end\2)/gsi){
        my ($data1, $value, $rest) = ($1, $3, $4);
        $subout = generator($rest,$x) if $rest;
        push @out, $value;

    }
    push @out, $subout if $subout;
#   say "X: $x ". Dumper \@out;
    return \@out;
}

The output is:

Output: $VAR1 = [
          'A1',
          [
            'B1',
            'B2',
            [
              'C2',
              [
                'D1',
                'D2',
                'D3'
              ]
            ]
          ]
        ];

Am I missing something? Any other method of creating the data structure would also be helpful.

XML:

<?xml version="1.0" encoding="UTF-8"?>
<root>
<class1 name="A1">
 <class2 name="B1">
  <class3 name="C1">
   <class4 name="D1">
   </class4>
   <class4 name="D2">
   </class4>
  </class3>
 </class2>
<class2 name="B2">
 <class3 name="C2">
  <class4 name="D1">
  </class4>
  <class4 name="D2">
  </class4>
  <class4 name="D3">
  </class4>
 </class3>
</class2>
</class1>
</root>
waghso

1 Answer

This code will do as you ask, but it is basically what the awful XML::Simple tries to do: it loses information, and it isn't possible with a general XML document.

use strict;
use warnings 'all';

use XML::LibXML::Reader;

use constant XML_FILE => 'root.xml';

my %data;
my @stack = (\%data);

my $reader = XML::LibXML::Reader->new(location => XML_FILE);

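while ( $reader->read ) {

    my $type = $reader->nodeType;

    if ( $type == XML_READER_TYPE_ELEMENT ) {
        # Skip elements without a name attribute, such as <root>
        next unless my $name = $reader->getAttribute('name');
        # Add a child hash under the current node and descend into it
        push @stack, ($stack[-1]{$name} = {});
    }
    elsif ( $type == XML_READER_TYPE_END_ELEMENT ) {
        # Climb back up, but only for elements that were pushed
        pop @stack if $reader->getAttribute('name');
    }
}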

use Data::Dump;
dd \%data;

Output:

{
  A1 => {
          B1 => { C1 => { D1 => {}, D2 => {} } },
          B2 => { C2 => { D1 => {}, D2 => {}, D3 => {} } },
        },
}
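
If the nested arrays from the question are what is needed, a stack over the start/end events avoids the recursive regex. In the original generator, $subout is overwritten on every pass through the while loop and pushed only once after it, which is why only the subtree of the last sibling (C2) survives. The following is a minimal sketch rather than a definitive implementation: build_tree is a hypothetical name, and it assumes every line has the form "event, path, name" with properly nested events, as in $test.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: turn "start"/"end" event lines into nested arrays.
# Each element contributes its name, followed by an array reference
# holding its children (omitted when the element has no children).
sub build_tree {
    my ($events) = @_;

    my @root;
    my @stack = (\@root);    # $stack[-1] is the array currently being filled

    for my $line ( split /\n/, $events ) {
        $line =~ s/^\s+|\s+$//g;                       # trim the line
        my ($event, $path, $name) = split /\s*,\s*/, $line;
        next unless defined $name;                     # skip malformed lines

        if ( $event eq 'start' ) {
            my @children;
            push @{ $stack[-1] }, $name, \@children;   # name, then its children
            push @stack, \@children;                   # descend
        }
        elsif ( $event eq 'end' ) {
            my $children = pop @stack;                 # climb back up
            # The children array is still the last item of the parent;
            # drop it if nothing was added to it.
            pop @{ $stack[-1] } unless @$children;
        }
    }

    return \@root;
}

# Shortened sample in the same format as $test (assumed input shape)
my $sample = join "\n",
    'start, /root/class1, A1',
    'start, /root/class1/class2, B1',
    'start, /root/class1/class2/class3, C1',
    'end, /root/class1/class2/class3, C1',
    'end, /root/class1/class2, B1',
    'start, /root/class1/class2, B2',
    'end, /root/class1/class2, B2',
    'end, /root/class1, A1';

print Dumper build_tree($sample);
```

Passing the full $test string from the question to build_tree should give A1 followed by an array that holds B1, B1's children, B2 and B2's children, so C1, D1 and D2 are kept. Unlike the hash version, this keeps sibling order and repeated names, but it still ignores text content and any other attributes.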
Borodin