After a couple of months of using our application (which internally uses Jackrabbit 1.6.4 to store documents), the customer's Oracle database already has more than 6 million rows in the VERSION_BUNDLE table. Some of our data uses the Jackrabbit versioning feature for multiple instances of the same document. We expect usage of the application to grow over the coming months and years, so we also expect the amount of data stored in Jackrabbit to grow faster.

Some of our operations people are worried about the number of records in this table (and the DEFAULT_BUNDLE table as well). Is there a way to safely purge some data from these tables? I guess simply deleting the documents through the Jackrabbit API will not do this, right?

Do we need to be worried about the number of records in the table? What amount of data are other people seeing in their Jackrabbit installations?

nwinkler

1 Answer

Instead of having multiple instances of the same document, why not store it once and keep references to it? We do this by storing the path/identifier of the node as a property of another node, so we can easily look it up.

To remove unwanted versions you can use VersionHistory.removeVersion().
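
As a rough sketch of how a pruning pass might pick versions to remove: the helper below is hypothetical (the class name, method name and the "keep the newest N" policy are my assumptions, not anything Jackrabbit provides). It takes the version names in creation order, skips `jcr:rootVersion` (the root version of a history cannot be removed), and returns the older names. Against a real repository you would collect the names from `VersionHistory.getAllVersions()` and pass each returned name to `VersionHistory.removeVersion(name)`; I have not verified the iteration order or the behaviour on 1.6.

```java
import java.util.ArrayList;
import java.util.List;

public class VersionPruning {
    // Name JCR gives to the root version of every version history.
    static final String ROOT_VERSION = "jcr:rootVersion";

    /**
     * Hypothetical helper: returns the version names to remove so that only
     * the newest `keep` versions remain. `names` is assumed to be ordered
     * oldest first; the root version is never returned.
     */
    static List<String> versionsToRemove(List<String> names, int keep) {
        List<String> removable = new ArrayList<>();
        for (String name : names) {
            if (!ROOT_VERSION.equals(name)) {
                removable.add(name);
            }
        }
        // Everything before the cut-off index is older than the kept versions.
        int cut = Math.max(0, removable.size() - keep);
        return new ArrayList<>(removable.subList(0, cut));
    }

    // With a real javax.jcr.version.VersionHistory `history` (not shown here):
    //   for (String name : versionsToRemove(names, 10)) {
    //       history.removeVersion(name);
    //   }
}
```

Note that a version which is still referenced (for example the base version of a checked-out node) may be refused by `removeVersion()`, so a real run should be prepared to catch the exception and skip that version.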

I don't know about Jackrabbit 1.6, but in 2.4 removing nodes/versions appears to remove the corresponding entries from the version and default bundle tables. I don't think I'd want to delete entries manually!

TedTrippin
  • When I mentioned multiple instances of the same document I meant multiple versions of the same document. So it's not the same actual document, but multiple versions (which include changes). I'll look into the `VersionHistory.removeVersion()` method to see if that does what we want. Agreed about not wanting to do this manually! – nwinkler Jun 20 '12 at 14:52