I am currently trying to create a Perl script that uses XML::LibXML to process data in an SVG font.
In an SVG font, each character is defined as a glyph element with a unicode attribute that defines its Unicode address in the form of a Unicode entity, like so:
<glyph unicode=" " />
Part of what I want to do is take the value of each glyph element's unicode attribute and process it as a string. However, when I call $element->getAttribute('unicode') on a glyph node, it returns a "wide character" that displays as a placeholder rectangle, which leads me to believe that it expands the Unicode entity into the actual Unicode character and returns that.
When I create my parser, I set expand_entities to 0, so I am not sure what else I could do to prevent this. I am rather new to XML processing, so I'm not sure I actually understand what's going on, or whether this is even supposed to be preventable.
Here is a code sample:
use utf8;
use open ':std', ':encoding(UTF-8)';
use strict;
use warnings;
use XML::LibXML;
$XML::LibXML::skipXMLDeclaration = 1;
my $xmlFile = $ARGV[0];
my $parser = XML::LibXML->new();
$parser->load_ext_dtd(0);
$parser->validation(0);
$parser->no_network(1);
$parser->recover(1);
$parser->expand_entities(0);
my $xmlDom = $parser->load_xml(location => $xmlFile);
my $xmlDomSvg = XML::LibXML::XPathContext->new();
$xmlDomSvg->registerNs('svg', 'http://www.w3.org/2000/svg');
foreach my $myGlyph ($xmlDomSvg->findnodes('/svg:svg/svg:defs/svg:font/svg:glyph', $xmlDom))
{
    my $myGlyphCode = $myGlyph->getAttribute('unicode');
    print $myGlyphCode . "\n";
}
Note: If I run print $myGlyph->toString();, the Unicode entity in the output is not expanded, which is why I conclude that the expansion is happening in the getAttribute method.
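For reference, here is a minimal sketch of one way to recover the entity-style text from whatever getAttribute returns. It assumes the value is an already-decoded character string, which fits the XML specification treating numeric character references (such as &#x20;) separately from entity references, so expand_entities would not be expected to affect them. The helper name to_char_refs is hypothetical and not part of XML::LibXML, and the sample values merely stand in for real attribute values.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::LibXML): rebuild the numeric
# character reference form from an already-decoded attribute value.
sub to_char_refs {
    my ($value) = @_;
    # ord() gives each character's code point; %X formats it as uppercase hex.
    return join '', map { sprintf('&#x%X;', ord($_)) } split //, $value;
}

# "\x{E001}" stands in for what getAttribute('unicode') might return
# for a glyph in a private-use range (an assumed example value).
print to_char_refs("\x{E001}"), "\n";   # prints &#xE001;

# A ligature glyph may carry more than one character in its unicode attribute.
print to_char_refs("fi"), "\n";         # prints &#x66;&#x69;
```

In the loop above, this would be applied as to_char_refs($myGlyph->getAttribute('unicode')). Since the output is plain ASCII, printing it should not trigger a "wide character" warning.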