5

I have an XML file with the following line:

            <VALUE DECIMAL_VALUE="0.2725" UNIT_TYPE="percent"/>

I would like to increment this value by .04 and keep the format of the XML in place. I know this is possible with a Perl or awk script, but I am having difficulty with the expressions to isolate the number.

brian d foy
DC.

5 Answers

4

If you're on a box with the xsltproc command in place I would suggest you use XSLT for this.
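As a rough sketch of what the XSLT route might look like: an identity transform copies everything through unchanged, and one extra template rewrites the attribute. The file names, the ROOT wrapper element, and the `'0.####'` output pattern (which caps the result at four decimal places) are assumptions made for illustration, not part of the question.

```shell
xslt_dir=$(mktemp -d)

# Hypothetical sample input; the ROOT wrapper element is an assumption.
cat > "$xslt_dir/in.xml" <<'EOF'
<ROOT>
            <VALUE DECIMAL_VALUE="0.2725" UNIT_TYPE="percent"/>
</ROOT>
EOF

# Identity transform plus one override for VALUE/@DECIMAL_VALUE.
cat > "$xslt_dir/bump.xsl" <<'EOF'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Copy every node and attribute as is. -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <!-- Replace DECIMAL_VALUE with its value plus 0.04. -->
  <xsl:template match="VALUE/@DECIMAL_VALUE">
    <xsl:attribute name="DECIMAL_VALUE">
      <xsl:value-of select="format-number(. + 0.04, '0.####')"/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
EOF

# Only run the transform if xsltproc is actually installed.
if command -v xsltproc >/dev/null 2>&1; then
  xsltproc "$xslt_dir/bump.xsl" "$xslt_dir/in.xml" > "$xslt_dir/out.xml"
fi
```

Note that xsltproc re-serializes the document, so details such as the XML declaration may differ from the input even though the element content is preserved.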

For a Perl solution I'd go for using the DOM. Check out the article "DOM Processing with Perl".

That said, if your XML file is produced in a predictable way, something naïve like the following could work:

perl -pe 's#(<VALUE DECIMAL_VALUE=")([0-9.]+)(" UNIT_TYPE="percent"/>)#"$1" . ($2 + 0.04) . "$3"#e;'
PEZ
3

If you are absolutely sure that the format of your XML will never change, that the order of the attributes is fixed, that you can indeed get the regexp for the number right... then go for the non-parser based solution.

Personally I would use XML::Twig (maybe because I wrote it ;--). It will process the XML as XML, while still respecting the original format of the file, and won't load it all in memory before starting to work.

Untested code below:

#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

XML::Twig->new( # call the sub for each VALUE element with a DECIMAL_VALUE attribute
                twig_roots => { 'VALUE[@DECIMAL_VALUE]' => \&upd_decimal },
                # print anything else as is
                twig_print_outside_roots => 1,
              )
         ->parsefile_inplace( 'foo.xml');

sub upd_decimal
  { my( $twig, $value)= @_; # twig is the XML::Twig object, $value the element
    my $decimal_value= $value->att( 'DECIMAL_VALUE');
    $decimal_value += 0.04;
    $value->set_att( DECIMAL_VALUE => $decimal_value);
    $value->print;
  }
PEZ
mirod
2

This takes input on stdin, outputs to stdout:

while(<>){
 # [^"]* stops at the closing quote, so the remaining attributes stay in $3
 if( $_ =~ /^(.*DECIMAL_VALUE=")([^"]*)(".*)$/ ){
  my $newVal = $2 + 0.04;
  print "$1$newVal$3\n";
 }else{
  print $_;
 }
}
Paul
0

Here's a gawk version:

awk '/DECIMAL_VALUE/{
 for(i=1;i<=NF;i++){
    if( $i~/DECIMAL_VALUE/){
        gsub(/DECIMAL_VALUE=|\042/,"",$i)
        $i="DECIMAL_VALUE=\042" ($i+0.04) "\042"
    }
 }
}1' file
ghostdog74
0

Something akin to the following will work. It may need tweaking if there is extra spacing, but that is left as an exercise for the reader.

function update_after(in_string, locate_string, delta,    local_pos, leadin, leadout, new_value, quote_pos) {
    local_pos = index(in_string, locate_string);
    leadin    = substr(in_string, 1, local_pos - 1);   # awk strings are 1-indexed
    leadout   = substr(in_string, local_pos + length(locate_string));
    new_value = leadout + delta;                        # uses the numeric prefix of leadout
    quote_pos = index(leadout, "\"");
    leadout   = substr(leadout, quote_pos + 1);
    return leadin locate_string new_value "\"" leadout;
}

/^ *<VALUE/ {
    print update_after($0, "DECIMAL_VALUE=\"", 0.04);
    next;
}
{ print }   # pass every other line through unchanged
Peter Mortensen
Richard Harrison