Here is a sample of my XML:

 let $test :=   
    <root>
        <a z="">stuff</a>
        <b z="12" y="">more stuff</b>
        <c>stuff</c>
        <d z = " " y="0" x ="lkj">stuff goes wild</d>
    </root>

I would like to remove empty attributes using XQuery to get this:

<root>
    <a>stuff</a>
    <b z="12">more stuff</b>
    <c>stuff</c>
    <d y="0" x ="lkj">stuff goes wild</d>
</root>  

I've gotten this far with my query, but I cannot get it to remove only the empty attributes, rather than removing all attributes whenever an element contains an empty one.

declare function local:sanitize ($nodes as node()*) {
for $n in $nodes 
return typeswitch($n)
    case element() return 
        if ($n/@*[normalize-space()='']) then  (element{node-name($n)} {($n/@*[.!=''], local:sanitize($n/node()))})
        else (element {node-name($n)} {($n/@*, local:sanitize($n/node()))})
default return ($n)
};

The function needs to be performant, hence my desire to use typeswitch. I feel I'm close, but the last step eludes me: z = " " doesn't get caught. Thanks for the help.

duncdrum

1 Answer

The problem with your code is that when recreating the elements, the filter .!='' only drops completely empty attributes, not attributes that are empty after whitespace normalization. Replace the then branch with this, and you're fine:

if ($n/@*[normalize-space()='']) then  (element{node-name($n)} {($n/@*[normalize-space(.)!=''], local:sanitize($n/node()))})

I also simplified the pattern to distinguish between attributes, elements, and everything else: filter out empty attributes, recreate elements, and return anything else unchanged. The resulting function is much easier to read and understand, and produces the correct output:

declare function local:sanitize ($nodes as node()*) {
for $node in $nodes 
return typeswitch($node)
  (: filter empty attributes :)
  case attribute() return $node[normalize-space(.)]
  (: recreate elements :)
  case element() return 
    element { node-name($node) } {
      (: Sanitize all children :)
      for $child in $node/(attribute(), node())
      return local:sanitize($child)
    }
  (: neither element nor attribute :)
  default return $node
};
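
To check it against the sample from the question, a query body along these lines could follow the declaration above in the same main module. This is only a sketch: $test is the variable name from the question, and the exact serialization (indentation, attribute order) depends on the processor.

```xquery
(: Query body; assumes local:sanitize is declared in the prolog above. :)
let $test :=
    <root>
        <a z="">stuff</a>
        <b z="12" y="">more stuff</b>
        <c>stuff</c>
        <d z = " " y="0" x ="lkj">stuff goes wild</d>
    </root>
(: Empty and whitespace-only attributes are dropped; y="0" is kept, because
   the predicate tests the string "0", which is non-empty and therefore true. :)
return local:sanitize($test)
```

The result should correspond to the desired document in the question: a loses z, b keeps only z="12", c is unchanged, and d keeps y and x.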
Jens Erat
  • Thanks, I didn't think to check for attributes in the first case statement. Will this actually impact the order in which nodes are processed? – duncdrum Sep 14 '14 at 20:36
  • No, as it loops over all nodes and handles them one after the other. The only important thing to know is that attributes always have to be created first, hence the `$node/(attribute(), node())` part (doing this the other way round would raise an error). – Jens Erat Sep 14 '14 at 20:41
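
The ordering rule from the last comment can be seen in a tiny standalone query. This is a minimal sketch with made-up element and attribute names; the error code in the comment is the one the XQuery specification defines for an attribute node that follows non-attribute content in an element constructor.

```xquery
(: Attributes first, then other content: this constructs <a z="12">stuff</a>. :)
let $ok := element a { attribute z { "12" }, "stuff" }
return $ok

(: The reverse order,
     element a { "stuff", attribute z { "12" } }
   is a type error (err:XQTY0024), because the attribute node
   would follow a node that is not an attribute. :)
```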