I've been using Liferay a lot for the past 2 years, but I have never needed any extensive document management.

Now I have a portlet where users upload documents (MS Office OLE2 documents, ODS documents, PDFs, etc.), and I have to persist them with all available metadata.

I know how I would do that without Liferay: I'd probably use Apache Solr with Apache Tika (UpdateRichDocuments and ExtractingRequestHandler), or Apache Jackrabbit, which uses Apache Tika under the hood (org.apache.jackrabbit.extractor.*).

The problem is that when I look at the Liferay trunk, there are some key classes:

Hooks (JCRHook, FileSystemHook, CMISHook, S3Hook), which are called more or less directly from DLLocalServiceImpl.

The alternative is DLAppLocalServiceImpl, which uses DLRepositoryLocalServiceImpl. The files are also persisted into the repository via Hooks, but a lot of additional work is done there.

  1. Liferay does not ship the jackrabbit-text-extractors library, so I suppose that extracting metadata from PDF, DOC, and ODS documents would be very hard, because the DL service layer doesn't accept additional properties.

    1. I think I'd have to bypass the DL services and the JCR hook and access Jackrabbit directly. But then I would lose compatibility and the ability to migrate my repository, etc.
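To make "metadata" concrete for one of the formats mentioned: an ODF file such as an ODS spreadsheet is a ZIP package whose `meta.xml` entry carries Dublin Core elements (`dc:title`, `dc:creator`, and so on). The following is only a minimal sketch using the plain JDK, not Liferay's or Tika's API; the class name `OdfDublinCore` and the method `extract` are hypothetical. PDF and OLE2 documents would need a library such as Apache Tika instead.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Liferay, Jackrabbit, or Tika.
final class OdfDublinCore {

    // Namespace of the Dublin Core element set used in ODF meta.xml
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    private OdfDublinCore() {}

    /** Returns every dc:* element found in meta.xml, keyed by local name. */
    static Map<String, String> extract(InputStream odf) throws Exception {
        Map<String, String> metadata = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(odf)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"meta.xml".equals(entry.getName())) {
                    continue; // skip content.xml, styles.xml, ...
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Uploaded files are untrusted input, so refuse DOCTYPE declarations
                factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
                // Copy the entry first so the parser cannot close the ZIP stream
                byte[] xml = zip.readAllBytes();
                Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
                NodeList nodes = doc.getElementsByTagNameNS(DC_NS, "*");
                for (int i = 0; i < nodes.getLength(); i++) {
                    Node node = nodes.item(i);
                    metadata.put(node.getLocalName(), node.getTextContent().trim());
                }
                break; // meta.xml appears once per package
            }
        }
        return metadata;
    }
}
```

The resulting map would still have to be stored somewhere, which is exactly the part the DL service layer does not appear to support.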

Could anybody please elaborate on this? Thank you.

lisak

4 Answers

2

Solr for indexing, Jackrabbit for document storage. Managing the Liferay Document Library in code is fairly easy: just look at the DL*LocalServiceUtil classes, namely DLFolderLocalServiceUtil and DLFileLocalServiceUtil. By default Liferay just creates a matching folder/file structure on the hard drive (with names changed), so you'd only need to write code or use Jackrabbit if you wanted more than this. Liferay already allows upload, download, and viewing out of the box via the control panel and various portlets.

I haven't used Jackrabbit with Liferay, but once it is configured everything should be managed under the covers, and you shouldn't need to worry about it on the front end.

When you say "with all metadata available" I'm not sure what is retained, but aside from renaming the file so that it can be tracked there shouldn't be any other changes. It should be quick and easy to test by uploading a file of each type and checking the entries in the LIFERAY/data/document_library directory and subdirectories. Again this would be different if Jackrabbit is used.

David O'Meara
  • Thank you David, but with all due respect, your answer doesn't solve much, because I think this can be answered only by those who have actually used Jackrabbit or Alfresco in Liferay. The API has changed a lot in 6.x, and substantial changes have been made. Even services like DLFileLocalServiceUtil (as you say) do not exist. But DLAppLocalServiceUtil appeared, and it is not clear how it works. And as I haven't used Jackrabbit or Alfresco, I don't know much about how to extend it. By "metadata" I mean http://en.wikipedia.org/wiki/Dublin_Core , properties that all the documents I have mentioned contain. – lisak Feb 28 '11 at 10:41
  • I checked 6.0.5 CE and 6.0.11.1 EE SP1 and both contain (for example) `com.liferay.portlet.documentlibrary.service.DLFolderLocalServiceUtil` in portal-service.jar, and this jar can be referenced by other portlets through the ClassLoader hierarchy. I'm not sure what you're seeing. – David O'Meara Feb 28 '11 at 10:53
  • Sorry, by 6.x I was referring to 6.1, which is not released yet. I should have called it 6.1.x. It has been worked on since October, I guess. These classes disappeared from trunk in November, if I remember correctly... only DLLocalServiceUtil remains. – lisak Feb 28 '11 at 11:01
  • If you have a Jira account, take a look at this feed http://issues.liferay.com/secure/ViewProfile.jspa?name=caorongjin . And click "show more" a couple of times. Really massive changes have been made to the document library. – lisak Feb 28 '11 at 11:03
  • Sorry I can't access that link (no privs) but I've used the same DL code in Liferay 5.2, 6CE, 6EE and 6EESP1 without issue. – David O'Meara Feb 28 '11 at 11:09
  • Without getting too side tracked, have you tried a vanilla installation of 6CE, start it up, add a Document Library portlet, upload a document and then download it to see if the metadata is retained? – David O'Meara Feb 28 '11 at 11:13
  • Yeah, I've used DL many times. I can see from the source code what metadata is available, but Liferay by default doesn't extract document metadata (e.g. via PDFBox or Apache POI, which is usually done by Apache Tika). I need to extend the system to provide this functionality. But Jackrabbit and Alfresco put an additional layer on top of Apache Tika to extract metadata. That's why I'm asking, because in 6.1.x it is very unclear. I'm not gonna use anything other than 6.1.x after so many changes have been made in DL. – lisak Feb 28 '11 at 11:27
  • Sorry initially I didn't see that you were referring to 6.1, but as that code is not generally available you would need to raise it with Liferay, I guess. – David O'Meara Mar 01 '11 at 10:07
1

Both services, DLLocalServiceImpl and DLAppLocalServiceImpl, are important and, I suppose, will remain so. The former is for direct access to the repository. Notice that when adding a file via this service you need to persist the corresponding DLFileEntry into the database first and then reference it in addFile(...., fileEntryId, ...).

The latter service does additional work for you, mainly asset management and workflow.

For your use case, I would avoid the document library, because no metadata can be passed down into the JCR repository. The only metadata you could store would be custom properties, AKA the Expando feature of Liferay Portal.

The best way for you seems to be to implement your own Jackrabbit hook that stores the data in the repository, and let the Liferay document library use that repository.

lisak
0

You should always use DLAppServiceUtil (as Liferay specifically instructs). Here is my working code that saves a file to the CMS:

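// Imports assume Liferay 6.1 package names.
import java.io.File;
import java.util.Random;
import javax.portlet.ActionRequest;
import com.liferay.portal.kernel.exception.PortalException;
import com.liferay.portal.kernel.exception.SystemException;
import com.liferay.portal.model.Group;
import com.liferay.portal.service.ServiceContext;
import com.liferay.portal.service.ServiceContextFactory;
import com.liferay.portlet.documentlibrary.service.DLAppServiceUtil;

// Assumes the enclosing class declares a logger named "log".
public static void saveFileToCMS(ActionRequest aReq, long groupId, String fileName, File filenameWithPath) {
    try {
        ServiceContext serviceContext = ServiceContextFactory.getInstance(
                Group.class.getName(), aReq);

        // A random suffix makes a duplicate title unlikely (it does not rule one out)
        int suffix = new Random().nextInt(10000);

        // Arguments: repository (the group), folder (0 = root), source file name,
        // MIME type, title, description, change log, file, service context
        DLAppServiceUtil.addFileEntry(groupId, 0, fileName, "application/vnd.ms-excel",
                fileName + suffix, "description goes here", "changelogname",
                filenameWithPath, serviceContext);

        //log.info("Successfully added the new file");

    } catch (PortalException pe) {
        // Pass the exception to the logger instead of printing the stack trace
        log.error("Portal Exception occurred while saving file to CMS", pe);
    } catch (SystemException e) {
        log.error("System Exception occurred while saving file to CMS", e);
    }
}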
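One caveat about the suffix: a random number between 0 and 9999 can repeat, and appending it after the file name also pushes it past the extension. A possible alternative, shown here only as a sketch with a hypothetical helper name (`DlTitles.uniqueTitle` is not a Liferay API), is to insert a random UUID before the extension:

```java
import java.util.UUID;

// Hypothetical helper for building a title that is very unlikely to repeat.
final class DlTitles {

    private DlTitles() {}

    /** Returns fileName with "-<random UUID>" inserted before the extension, if any. */
    static String uniqueTitle(String fileName) {
        String suffix = "-" + UUID.randomUUID();
        int dot = fileName.lastIndexOf('.');
        if (dot <= 0) {
            // No extension, or a name that only starts with a dot
            return fileName + suffix;
        }
        return fileName.substring(0, dot) + suffix + fileName.substring(dot);
    }
}
```

The result of `DlTitles.uniqueTitle(fileName)` could then be passed as the title argument in place of `fileName + suffix`.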
Sean Gildea
0

I think Edgar is correct. If you check the current trunk at http://svn.liferay.com/repos/public/portal/trunk/portal-service/src/com/liferay/documentlibrary/service/DLLocalService.java (log in as guest with no password), you will no longer find the class DLFolderLocalServiceUtil. We are using the existing DLFolderLocalServiceUtil class as well. Thanks for the heads-up. We will refactor our code so that when 6.1 comes around we can still use the Document Library services.