I have created a REST API that receives an HTTP POST request (an array of JSON requests) from the UI. This triggers XQuery code that spawns the requests to execute some functionality, which may take 10-30 minutes to complete. The maximum request count is 1000.

Please find the outline of the code below:

declare function local:upload-record($req-xml, $chunk-size, $upload-uri) {
  if (exists($req-xml))
  then (
    let $log := xdmp:log("Upload started")
    let $req-count := fn:count($req-xml)
    let $response := (
      (: Process the first chunk in this task :)
      for $req in $req-xml[1 to $chunk-size]
      let $user-name := $req/createdBy/text()
      let $fetch-url := fn:concat("http://", $get-host:host, ":{port}/fetchRecord")
      let $auto-fetch := doc-lib:fetch-record($fetch-url)
      let $doc-id := $auto-fetch[2]/envelope/doc-id/text()
      let $auto-save := doc-lib:save-record($req, $doc-id)
      let $publish-url := fn:concat("http://", $get-host:host, ":{port}/publishRecord")
      let $auto-publish := doc-lib:publish-record($publish-url, $doc-id, $user-name)
      return $auto-publish
      ,
      (: Spawn a new task for the remaining requests :)
      xdmp:spawn-function(function() {
          local:upload-record(subsequence($req-xml, $chunk-size + 1), $chunk-size, $upload-uri)
        },
        <options xmlns="xdmp:eval">
          <result>true</result>
          <update>true</update>
          <commit>auto</commit>
        </options>)
    )
    return $response
  )
  else xdmp:log("Job completed successfully")
};

let $req := xdmp:get-request-body("json")/node()

let $config := json:config("custom")
let $req-xml := json:transform-from-json($req,$config)
let $chunk-size := 10

let $resp := xdmp:spawn-function(function() {local:upload-record($req-xml, $chunk-size,$bulk-upload-uri)},  <options xmlns="xdmp:eval"><result>true</result></options>)
  
return <response>Upload triggered successfully</response>

If an error occurs, say a timeout, that stops the request processing midway through the task, I need to report to the UI that processing stopped because of the error and provide the partial results. So, is it possible to use try/catch with a spawned function? If so, how can we do it?

Antony
  • Your UI is going to wait around for 10-30 minutes for the result? If you are willing to wait that long, why not bump the execution time to 60 minutes or more? Seems that you are trying to do really large batch work in a single transaction and maybe you could instead break up that set of work and do things differently or just be prepared to wait longer. – Mads Hansen Aug 02 '22 at 14:57
  • @Mads Hansen The API call just triggers the process and returns the message "Upload triggered successfully" within milliseconds, to indicate that the request reached ML. But the background task takes about 10-30 minutes because of the internal calls it makes, and if an error stops the process in the background, I'm not able to track it. – Antony Aug 02 '22 at 16:07

1 Answer

OK. Below is a sample of one strategy that you could take:

Keep a record in the system that tracks the work under the hood.

declare namespace my-lib = "http://www.example.com/my-lib";

(: Record a process step against the ID.
  Note: this function is quick, but it runs as a separate transaction, so the calling code (all 5 chunk processors) queues up to update the same document.
  If you log every step, it could slow the system down and create huge XML files. Tens of entries are OK; hundreds are probably OK.
  Other strategies could be used. However, the lock is held only for the update/insert itself (hence the eager evaluation of the entry before the invoke-function call).
:)

declare function my-lib:record-process-step($id, $status, $message, $data){
  let $uri := "/my-logs/" || $id
    let $entry := xdmp:eager(my-lib:generate-entry($id, $status, $message, $data))
    let $_ := xdmp:invoke-function(function(){if(fn:exists(fn:doc($uri)))
         then xdmp:node-insert-child(fn:doc($uri)/my-lib:log, $entry)
         else xdmp:document-insert($uri, <my-lib:log>{$entry}</my-lib:log>)
       })
  return $uri
};

(: Generate a single record for a process step. Note the XML payload $data: different steps may have different information to pass. :)
declare function my-lib:generate-entry($id as xs:string, $status as xs:string, $message as xs:string, $data as element()){
  element my-lib:entry {
    attribute id {$id},
    attribute status {$status},
    attribute message  {$message},
    attribute timstamp {fn:current-dateTime() + xdmp:elapsed-time()},
    $data
  }
};

(:For this sample, a unique ID:)
let $id := sem:uuid-string()

(:Mimic your chunking  - I chose 5 as a sample :)
let $number-of-chunks := 5 

(: First entry: initialization. It records how many chunks there are; the payload details are the context of "Initialize Processing". :)
let $_ := my-lib:record-process-step($id, "intialize-processing", "starting processing of all chunks", <details><total-chunks>{$number-of-chunks}</total-chunks></details>)

(: Mimic the processing of chunks:)
let $_ := for $chunk-number in (1 to $number-of-chunks)
  return xdmp:spawn-function(function(){
    
    (: Message about starting a chunk; the details reference this chunk :)
    let $_ := my-lib:record-process-step($id, "chunk-start", "starting processing of chunk " ||  $chunk-number, <details><chunk-number>{$chunk-number}</chunk-number></details>)

    (: Mimic doing some work  - variable amount of stuff per chunk to get some interesting records back:)
    let $_ := for $x in (1 to xdmp:random(5))

      (: Mimic the work taking different amounts of time, so that the log order is random per chunk.
         The work update is logged for this sample only; see the notes at the top about how much to log. I would keep this logging off unless I really needed intermediate information. :)
      let $_ := xdmp:sleep((xdmp:random(5)+5)*1000) 
      let $_ := my-lib:record-process-step($id, "chunk-process-update", "I did something interesting in chunk " ||  $chunk-number, <details><chunk-number>{$chunk-number}</chunk-number><add-some-context-here/></details>)
      return ()

    (:Log when a chunk is done:) 
    let $_ := my-lib:record-process-step($id, "chunk-end", "end processing of chunk " ||  $chunk-number, <details><chunk-number>{$chunk-number}</chunk-number></details>)
    return ()
  })

return <response>{$id}</response>

Example Response:

<response>922116fa-2b0f-4f1e-ac5d-97a8b69dcec0</response>

The log document for /my-logs/922116fa-2b0f-4f1e-ac5d-97a8b69dcec0 is below. You will note that the entries appear in varying order per chunk and that the chunks do not finish in order. However, there are some critical data points:

  • one entry recording the total number of chunks for this ID
  • 5 start entries, one for each chunk. Note: they are not guaranteed to start next to each other or right away; they may wait on the task queue, depending on other work competing for the Task Server
  • 5 end entries, out of order in terms of chunk number, by design in this sample

You could add additional functions to the library to check on status, such as:

  • total chunks minus count of chunks started = chunks still queued
  • total chunks greater than count of chunks completed = not done
  • processing completion time = once all chunks have completed, the timestamp of the last chunk-end entry
  • etc.
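The checks listed above could be sketched roughly as follows. This is a minimal sketch, not part of the original sample: the function name `my-lib:get-status` and the `<status>` element it returns are hypothetical, and it assumes the log document has the shape shown in this answer (including the `timstamp` attribute as spelled in the sample and exactly one initialization entry).

```xquery
xquery version "1.0-ml";

declare namespace my-lib = "http://www.example.com/my-lib";

(: Hypothetical helper: summarize the log document written for one ID.
   Assumes the log exists and contains exactly one initialization entry. :)
declare function my-lib:get-status($id as xs:string) as element(status)
{
  let $entries := fn:doc("/my-logs/" || $id)/my-lib:log/my-lib:entry
  (: Status value spelled as in the sample log :)
  let $total := xs:integer($entries[@status = "intialize-processing"]/details/total-chunks)
  let $started := fn:count($entries[@status = "chunk-start"])
  let $ended := $entries[@status = "chunk-end"]
  let $completed := fn:count($ended)
  return
    <status id="{$id}">
      <total-chunks>{$total}</total-chunks>
      <queued>{$total - $started}</queued>
      <running>{$started - $completed}</running>
      <completed>{$completed}</completed>
      <done>{$total eq $completed}</done>
      {
        (: Completion time is only meaningful once every chunk has ended :)
        if ($total eq $completed)
        then
          <finished-at>{
            fn:max(for $t in $ended/@timstamp return xs:dateTime($t))
          }</finished-at>
        else ()
      }
    </status>
};

my-lib:get-status("922116fa-2b0f-4f1e-ac5d-97a8b69dcec0")
```

A status endpoint could call a function like this with the ID returned in the `<response>` element and hand the result back to the UI, which would poll it at whatever interval suits the job length.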

These simple functions, if exposed to your calling system, could then be invoked externally to keep the other system (or browser) up to date on the current progress. This is sample code only: security is an important consideration that it does not cover. It also does not cover error situations, but strategically placed try/catch blocks could log many of those as a different entry type.
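To make the try/catch idea concrete, here is a minimal sketch of wrapping the body of each spawned function so that a failure is written to the same log as its own entry type. Several names are assumptions for illustration only: `local:record` is a compact stand-in for `my-lib:record-process-step`, `local:process-chunk` is a placeholder for the real work (it fails on purpose for chunk 3), and `chunk-error` is an invented status value. Because the log write runs through `xdmp:invoke-function` as a separate transaction, as in the sample above, the error entry should be committed even though the spawned task itself failed. Whether a timeout leaves the catch block enough time to write its entry is something you would need to verify in your environment.

```xquery
xquery version "1.0-ml";

declare namespace my-lib = "http://www.example.com/my-lib";
declare namespace error = "http://marklogic.com/xdmp/error";

(: Compact stand-in for my-lib:record-process-step from the sample above :)
declare function local:record(
  $id as xs:string, $status as xs:string, $message as xs:string, $data as element())
{
  let $uri := "/my-logs/" || $id
  let $entry := xdmp:eager(
    <my-lib:entry id="{$id}" status="{$status}" message="{$message}"
      timstamp="{fn:current-dateTime() + xdmp:elapsed-time()}">{$data}</my-lib:entry>)
  return xdmp:invoke-function(function() {
    if (fn:exists(fn:doc($uri)))
    then xdmp:node-insert-child(fn:doc($uri)/my-lib:log, $entry)
    else xdmp:document-insert($uri, <my-lib:log>{$entry}</my-lib:log>)
  })
};

(: Hypothetical placeholder for the real per-chunk work; chunk 3 fails on purpose :)
declare function local:process-chunk($chunk-number as xs:integer)
{
  if ($chunk-number eq 3)
  then fn:error(xs:QName("local:SIMULATED"), "simulated failure in chunk 3")
  else xdmp:sleep(1000)
};

let $id := sem:uuid-string()
let $number-of-chunks := 5
let $_ := local:record($id, "intialize-processing", "starting processing of all chunks",
  <details><total-chunks>{$number-of-chunks}</total-chunks></details>)
let $_ :=
  for $chunk-number in (1 to $number-of-chunks)
  let $details := <details><chunk-number>{$chunk-number}</chunk-number></details>
  return xdmp:spawn-function(function() {
    (: The try/catch lives inside the spawned function, so it runs on the Task Server :)
    try {
      local:record($id, "chunk-start", "starting processing of chunk " || $chunk-number, $details),
      local:process-chunk($chunk-number),
      local:record($id, "chunk-end", "end processing of chunk " || $chunk-number, $details)
    } catch ($e) {
      (: Log the failure as its own entry type, with the error code and text :)
      local:record($id, "chunk-error",
        "chunk " || $chunk-number || " failed: " || $e/error:message,
        <details>
          <chunk-number>{$chunk-number}</chunk-number>
          {$e/error:code}
          {$e/error:format-string}
        </details>)
    }
  })
return <response>{$id}</response>
```

With entries like these in the log, a status check can report a chunk as failed (a `chunk-error` entry with no matching `chunk-end`) and still return the partial results recorded for the chunks that did finish.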


<my-lib:log xmlns:my-lib="http://www.example.com/my-lib">
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="intialize-processing" message="starting processing of all chunks" timstamp="2022-08-08T14:15:13.7686171+02:00">
<details>
<total-chunks>5</total-chunks>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-start" message="starting processing of chunk 1" timstamp="2022-08-08T14:15:13.824793+02:00">
<details>
<chunk-number>1</chunk-number>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-start" message="starting processing of chunk 2" timstamp="2022-08-08T14:15:13.8360725+02:00">
<details>
<chunk-number>2</chunk-number>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-end" message="end processing of chunk 2" timstamp="2022-08-08T14:15:13.8370602+02:00">
<details>
<chunk-number>2</chunk-number>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-start" message="starting processing of chunk 3" timstamp="2022-08-08T14:15:13.8509333+02:00">
<details>
<chunk-number>3</chunk-number>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-start" message="starting processing of chunk 5" timstamp="2022-08-08T14:15:13.8666333+02:00">
<details>
<chunk-number>5</chunk-number>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-start" message="starting processing of chunk 4" timstamp="2022-08-08T14:15:13.8512207+02:00">
<details>
<chunk-number>4</chunk-number>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 5" timstamp="2022-08-08T14:15:19.8824323+02:00">
<details>
<chunk-number>5</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 4" timstamp="2022-08-08T14:15:19.9916399+02:00">
<details>
<chunk-number>4</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 1" timstamp="2022-08-08T14:15:20.8374943+02:00">
<details>
<chunk-number>1</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 3" timstamp="2022-08-08T14:15:22.8608729+02:00">
<details>
<chunk-number>3</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 3" timstamp="2022-08-08T14:15:28.8780495+02:00">
<details>
<chunk-number>3</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 5" timstamp="2022-08-08T14:15:28.8940915+02:00">
<details>
<chunk-number>5</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 4" timstamp="2022-08-08T14:15:29.9991982+02:00">
<details>
<chunk-number>4</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 1" timstamp="2022-08-08T14:15:30.8416467+02:00">
<details>
<chunk-number>1</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-end" message="end processing of chunk 1" timstamp="2022-08-08T14:15:30.844275+02:00">
<details>
<chunk-number>1</chunk-number>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 4" timstamp="2022-08-08T14:15:37.0065493+02:00">
<details>
<chunk-number>4</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 3" timstamp="2022-08-08T14:15:38.8920007+02:00">
<details>
<chunk-number>3</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 5" timstamp="2022-08-08T14:15:38.9078871+02:00">
<details>
<chunk-number>5</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 4" timstamp="2022-08-08T14:15:45.0274099+02:00">
<details>
<chunk-number>4</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 3" timstamp="2022-08-08T14:15:46.9046038+02:00">
<details>
<chunk-number>3</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 5" timstamp="2022-08-08T14:15:47.9183383+02:00">
<details>
<chunk-number>5</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-end" message="end processing of chunk 5" timstamp="2022-08-08T14:15:47.9213179+02:00">
<details>
<chunk-number>5</chunk-number>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 4" timstamp="2022-08-08T14:15:51.0422546+02:00">
<details>
<chunk-number>4</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-end" message="end processing of chunk 4" timstamp="2022-08-08T14:15:51.0440896+02:00">
<details>
<chunk-number>4</chunk-number>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-process-update" message="I did something interesting in chunk 3" timstamp="2022-08-08T14:15:51.9137371+02:00">
<details>
<chunk-number>3</chunk-number>
<add-some-context-here>
</add-some-context-here>
</details>
</my-lib:entry>
<my-lib:entry id="922116fa-2b0f-4f1e-ac5d-97a8b69dcec0" status="chunk-end" message="end processing of chunk 3" timstamp="2022-08-08T14:15:51.9168928+02:00">
<details>
<chunk-number>3</chunk-number>
</details>
</my-lib:entry>
</my-lib:log>
