I'd like to hear your thoughts on using these two CMSs (Alfresco and Jackrabbit) with Liferay. I know that Jackrabbit is more of a framework and the reference JCR implementation. I'm mostly interested in the situation where you have a Liferay portlet and need a content repository other than the Liferay Document Library because you need more features.

What I am concerned about:

  • Level of metadata extraction from various document formats (I see that both use Apache Tika parsers)

  • Level of content transformation, for instance dealing with not-quite-valid PDFs (OCR)

  • How easily a developer can extend functionality (for instance, implementing various actions on document processing)

Trying both of them takes a lot of time, so I have to decide on one and stick with it.

Thank you

lisak

1 Answer

I have never done anything serious with Jackrabbit, but I have done quite a lot of projects with Alfresco.

Since there's an ongoing joint effort between Alfresco and Liferay to provide a solid, validated integration, choosing Alfresco should at least minimize the integration effort between the two applications and possibly give you a good starting point for your project.

From the functional point of view, the following apply to Alfresco:

  • As you noted, Alfresco uses Tika for metadata extraction. A number of document types are supported by default, and adding your own custom metadata extractor is quite easy and well documented.

  • Alfresco will use Tika for transformations once project Swift (an upcoming version) is released. For now, tools like PDFBox and OpenOffice sit behind content transformations, and they are reliable enough for the average case.

  • Offering extension points for the repository is something Alfresco is quite good at: you can hook your code onto events on specific content types, configure rules on folders that are triggered when their content is created, updated, or deleted, and so on.
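
To make the last point a bit more concrete, here is a minimal plain-Java sketch of the general idea behind folder rules: an action is registered for a folder, an event, and a content type, and it runs only when all three match. This is not Alfresco's API. Every name in it (`FolderRuleSketch`, `Repository`, `Rule`, `Event`, `Document`, `addRule`, `fire`, the `/invoices` folder) is hypothetical and only there to show the shape of the mechanism.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch only: models "rules on folders" in memory.
// None of these types exist in Alfresco; they are assumptions for the example.
public class FolderRuleSketch {

    /** Repository events a rule can react to. */
    enum Event { CREATE, UPDATE, DELETE }

    /** Minimal stand-in for a stored document. */
    static final class Document {
        final String folder;
        final String name;
        final String mimeType;

        Document(String folder, String name, String mimeType) {
            this.folder = folder;
            this.name = name;
            this.mimeType = mimeType;
        }
    }

    /** A rule: run an action when an event hits a matching folder and content type. */
    static final class Rule {
        final Event event;
        final String folder;
        final String mimeType;
        final Consumer<Document> action;

        Rule(Event event, String folder, String mimeType, Consumer<Document> action) {
            this.event = event;
            this.folder = folder;
            this.mimeType = mimeType;
            this.action = action;
        }

        boolean matches(Event e, Document d) {
            return event == e && folder.equals(d.folder) && mimeType.equals(d.mimeType);
        }
    }

    /** Hypothetical in-memory repository that dispatches events to registered rules. */
    static final class Repository {
        private final List<Rule> rules = new ArrayList<>();

        void addRule(Rule rule) {
            rules.add(rule);
        }

        void fire(Event event, Document doc) {
            for (Rule rule : rules) {
                if (rule.matches(event, doc)) {
                    rule.action.accept(doc);
                }
            }
        }
    }

    /** Registers one rule, fires three events, and returns what the rule logged. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Repository repo = new Repository();
        repo.addRule(new Rule(Event.CREATE, "/invoices", "application/pdf",
                doc -> log.add("extract metadata: " + doc.name)));

        // Only the first event matches folder, event, and content type.
        repo.fire(Event.CREATE, new Document("/invoices", "a.pdf", "application/pdf"));
        repo.fire(Event.CREATE, new Document("/invoices", "b.txt", "text/plain"));
        repo.fire(Event.DELETE, new Document("/invoices", "a.pdf", "application/pdf"));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In Alfresco itself you would not write this dispatch code: the repository does the matching for you, and you only supply the action. Check the Alfresco documentation for the actual extension points and their signatures.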

skuro