I have XML files that look like this:
<?xml version="1.0" encoding="UTF-8"?>
<!-- some comment here -->
<rsccat version="1.0" locale="en_US" product="some_prouduct" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../../../../product/resources/schema/msgcat.xsd">
<message>
<entry key="entry1" lol="false">
<![CDATA[
<actions>
<action id="hmm" type="nothing">
<cmd>456</cmd>
<msg id="123"></msg>
</action>
</actions>
]]>
</entry>
<entry key="entry2">message2 </entry>
<entry key="entry3">message3 </entry>
<entry key="entry4">
<actions hello="yes">
<action type="lol">
<cmd>rolf</cmd>
<txt>omg</txt>
</action>
</actions> </entry>
</message>
</rsccat>
I would like to write a Perl function that takes the path of an XML file and a list of keys, and removes the entries associated with those keys entirely, without leaving any whitespace or blank lines behind. Moreover, I would like the blank lines that already exist in the original XML file to be preserved, for instance the three blank lines after the entry with key entry4.
I have written a function which removes the entries without leaving any blank lines, but it also removes the existing blank lines in the XML file.
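use strict;
use warnings;
use File::Slurp;    # provides read_file and write_file

sub findReplaceFile
{
    my ($filename, @keys) = @_;
    my $filetext = read_file($filename);
    foreach my $key (@keys)
    {
        chomp($key);    # remove a trailing newline from the key
        # \Q...\E quotes any regex metacharacters in the key
        my $regex = qr/<entry\s+key\s*=\s*"\Q$key\E".*?>.+?<\/entry>/s;
        $filetext =~ s/$regex//g;       # replace the entry with an empty string
        $filetext =~ s/\n\s*\n/\n/g;    # remove the leftover blank line (this also removes every pre-existing blank line)
    }
    write_file($filename, $filetext);   # write the modified text back to the file
}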
Please help me achieve this. I am fine with using either an XML parser module for Perl or plain old regex.
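To make the goal concrete, here is a minimal regex-based sketch of the behaviour described above. It is an illustration under assumptions, not a definitive solution: the names remove_entries and remove_entries_from_file are made up for this example, and it assumes that each entry starts on its own line, that entries are not nested inside one another, and that attribute values contain no ">" character. Instead of collapsing blank lines afterwards, it removes each matching entry together with its own indentation and the single newline that ends it, so blank lines elsewhere in the file are never touched.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every <entry key="..."> element whose key is
# listed in @keys from the given XML text and return the new text.
sub remove_entries {
    my ($text, @keys) = @_;
    for my $key (@keys) {
        my $k = quotemeta $key;    # treat the key as a literal string
        $text =~ s{
            ^[ \t]*                        # indentation in front of the entry
            <entry\s+key\s*=\s*"$k"        # opening tag with the wanted key
            [^>]*?                         # any further attributes
            (?: /> | > .*? </entry> )      # self-closing, or body plus end tag
            [ \t]*\r?\n?                   # trailing blanks and one line ending
        }{}gmsx;
    }
    return $text;
}

# Hypothetical wrapper: apply remove_entries to a file in place,
# using only core Perl for the file I/O.
sub remove_entries_from_file {
    my ($filename, @keys) = @_;
    open my $in, '<', $filename or die "Cannot read $filename: $!";
    my $text = do { local $/; <$in> };    # slurp the whole file
    close $in;
    open my $out, '>', $filename or die "Cannot write $filename: $!";
    print {$out} remove_entries($text, @keys);
    close $out or die "Cannot close $filename: $!";
    return;
}
```

With the sample file above, a call such as remove_entries_from_file($path, 'entry1', 'entry4') should leave only the entry2 and entry3 lines inside the message element, along with whatever blank lines were already there. A real XML parser would be more robust than this for inputs that break the stated assumptions.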