Does anyone have tips on which repository implementation offers good clustering and horizontal scaling characteristics on commodity hardware?

The problem is that we have to implement a preservation system on top of a repository that can ingest and manage large volumes of heterogeneous data (> 500 TB), including big files (> 50 GB).

It seems Fedora Commons can only be clustered by using a distributed filesystem. Apache Jackrabbit can be clustered, but its DataStore (for large binary data) has to be the same for all nodes in the cluster. Which repository systems should I check out?

javanna
fasseg

1 Answer

Give ModeShape a try. It is a JCR 2.0 implementation that can be configured to use an Infinispan data grid as its backing store. ModeShape is also easy to cluster: it uses JGroups, the same communication library used by the clustering features in Infinispan and JBoss Application Server, among many others.

Randall Hauch