I'm trying to change the value of several nodes in a very big XML file loaded in memory from a web form.

The file is obtained like this:

let $file := xdmp:get-request-field("xml_to_upload")

So, as you can see the file is in memory.

Now, I need to change the value of thousands of nodes, and so far I haven't been able to do it in an optimal way.

Any ideas?

Here is one of the things I've tried so far:

let $auxVar :=
        if($fileStructureIsValid) then
        (
            for $currentNode in xdmp:unquote($file)//ID

            let $log := xdmp:log( fn:concat( "newNodeValue", ": ", mem:replace( $currentNode, element ID{ fn:concat( $subject, "-", fn:data( $currentNode ) ) } ) ) )

                return fn:concat( $subject, "-", fn:data( $currentNode ) )
        )
        else
        (

        )

The mem library is a custom downloaded one.

BenMorel

1 Answer

If possible, insert the document into the database, and in a separate transaction update the nodes using xdmp:node-replace.

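xquery version "1.0-ml";
...
(: xdmp:document-insert expects a node, so parse the uploaded string first :)
xdmp:document-insert('file.xml', xdmp:unquote($file)) ;

xquery version "1.0-ml";

(: $subject must be declared again here; variables do not carry over between statements :)
for $currentNode in doc('file.xml')//ID
return xdmp:node-replace($currentNode,
  element ID{ concat($subject, "-", $currentNode) });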

Alternatively, if you have to update the document in memory, it is probably more efficient to walk the tree once and make all of your updates in that single pass, rather than calling mem:replace many times (each call probably re-walks the tree).

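declare function local:update-ids(
  $n as item(),
  $subject as xs:string
) as item()
{
  typeswitch ($n)
    (: xdmp:unquote returns a document node, so recurse into it as well :)
    case document-node() return
      document {
        for $child in $n/node()
        return local:update-ids($child, $subject) }
    case element(ID) return 
      element ID { concat($subject, "-", $n) }
    case element() return
      element { node-name($n) } {
        (: a function body has no context item, so attributes must come from $n :)
        $n/@*,
        (: a FLWOR keeps the rebuilt children in their original order :)
        for $child in $n/node()
        return local:update-ids($child, $subject) }
    default return $n
};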

let $xml := xdmp:unquote($file)
let $xml-with-updated-ids := local:update-ids($xml, $subject)
...

Update:

As Erik suggests in the comments, you can also write the logic of local:update-ids in XSLT (using xdmp:xslt-eval or xdmp:xslt-invoke to execute it), and the two approaches should be roughly equivalent in terms of performance. In fact, MarkLogic has a well-written blog entry on the subject:

http://developer.marklogic.com/blog/tired-of-typeswitch
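
For illustration, here is a rough sketch of what the XSLT variant might look like with xdmp:xslt-eval. The sample document, the "subj" value and the stylesheet itself are made up for this example (they are not from the question), and I have not benchmarked it against the typeswitch version. It uses an identity template plus one override for ID, and passes $subject in as a stylesheet parameter through a map:

```xquery
xquery version "1.0-ml";

(: Hypothetical stand-ins for the values used in the question. :)
let $subject := "subj"
let $xml := document { <root><item><ID>1</ID><name>a</name></item></root> }

let $stylesheet :=
  <xsl:stylesheet version="2.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:param name="subject"/>
    <!-- Identity template: copy everything that is not matched below. -->
    <xsl:template match="@*|node()">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:template>
    <!-- Rebuild every ID element with the subject prefixed to its value. -->
    <xsl:template match="ID">
      <ID><xsl:value-of select="concat($subject, '-', .)"/></ID>
    </xsl:template>
  </xsl:stylesheet>

(: Stylesheet parameters are passed as a map keyed by parameter name. :)
let $params := map:map()
let $_ := map:put($params, "subject", $subject)
return xdmp:xslt-eval($stylesheet, $xml, $params)
```

In real code you would replace the literal $xml with xdmp:unquote($file). Like the typeswitch function, the ID template here drops any attributes on the original ID element; add an xsl:copy-of select="@*" inside it if you need to keep them.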

wst
    For completeness, a third alternative is to apply an XSLT transform to the document in memory. – ehennum Sep 04 '13 at 21:39