I have around 43 million documents. The latest version of each document is in the LIVE collection, and the same versioned documents are also stored in separate version collections named like /collection/versionNumber. I want to delete the versioned collections, which hold around 34 million documents. What is the best approach to delete them all in one go?
2 Answers
You could try using xdmp:collection-delete() to delete all documents in the collection in a single transaction.
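A minimal sketch of that single-transaction call, assuming the collection URI is literally "/collection/versionNumber" (taken from the question as a placeholder; substitute the real collection name):

```xquery
xquery version "1.0-ml";

(: Removes every document that belongs to the collection in one transaction.
   The collection URI below is the placeholder from the question. :)
xdmp:collection-delete("/collection/versionNumber")
```

With tens of millions of documents this single request may hit the request time limit, which is where the batch approach comes in.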
If that can't delete everything in one shot, I would look at batch tools, for instance a CoRB job.
An example job options file with the properties needed, except for the XCC-CONNECTION-URI:
# Inline module to select all URIs from the collection
URIS-MODULE=INLINE-XQUERY|let $uris := cts:uris("",(),cts:collection-query("/collection/versionNumber")) return (count($uris), $uris)
# Inline module to delete the docs
PROCESS-MODULE=INLINE-XQUERY|declare variable $URI as xs:string external; xdmp:document-delete($URI)
THREAD-COUNT=10

I think your application is using the DLS library for versioning. If so, delete the versioned documents only if you will never need to look at an older version in the future. You can use the dls:document-unmanage API in that case.
Also, explore dls:purge and dls:document-purge before proceeding. I am not very sure about these two.
Anyway, even if it is not DLS, processing them all in one go (a single transaction) is not recommended. Either process them in batches or spawn them as separate tasks on the task server.
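A rough sketch of the batch-and-spawn idea, not a tuned implementation. It assumes the URI lexicon is enabled (cts:uris needs it), a MarkLogic version that accepts the <update> eval option, and that the collection URI and the batch size of 5000 are placeholders to adjust:

```xquery
xquery version "1.0-ml";

(: Placeholder values: adjust to the real collection and a batch size
   that suits the cluster. :)
declare variable $COLLECTION as xs:string := "/collection/versionNumber";
declare variable $BATCH-SIZE as xs:integer := 5000;

(: Collect the URIs from the URI lexicon, then split them into batches. :)
let $uris := cts:uris("", (), cts:collection-query($COLLECTION))
let $batch-count := xs:integer(fn:ceiling(fn:count($uris) div $BATCH-SIZE))
for $i in 1 to $batch-count
let $batch := fn:subsequence($uris, ($i - 1) * $BATCH-SIZE + 1, $BATCH-SIZE)
return
  (: Each batch is deleted in its own transaction on the task server. :)
  xdmp:spawn-function(
    function() {
      for $uri in $batch
      return xdmp:document-delete($uri)
    },
    <options xmlns="xdmp:eval">
      <update>true</update>
      <commit>auto</commit>
    </options>
  )
```

Each spawned task commits independently, so a failure affects only its own batch. Holding all 34 million URIs in one request is memory-heavy, so in practice the selection itself may also need to be paged.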

We are not using DLS, and I already know about spawn, but that would again take a significant amount of time. – Harmanjot Singh Aug 27 '20 at 11:37