I'm working on a way of programmatically accessing a Lotus Notes database to gather information about the attachments embedded in records from a given period.

My goal is to find records over a given period, then use Apache-POI to get metadata about document size, character count, etc.

The POI part works fine, and so far, I've been able to access the Lotus Notes records thanks to this help:

lotus notes search by date with Java api

and this answer also shows me how to download/copy the attachments:

How do I get all the attachments from a .nsf(lotus notes) file using java

From there I could run my POI code on the copies and then delete them at the end. This approach basically works, but I want to avoid the overhead of copying each attachment out of the database, saving it, and deleting it afterwards.

I tried passing the result of the EmbeddedObject getSource() method as input to my POI code, but got a FileNotFoundException: the POI code expects a String path from which it can create a File.
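To illustrate that failure without Notes or POI on the classpath, here is a JDK-only sketch. It assumes that getSource() on an attachment returns just the original file name rather than a path that exists on disk; the class name, method names, and file name below are hypothetical stand-ins, not part of either API.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SourceNameVsStream {

    // Hypothetical stand-in for code that insists on opening a file by name.
    static boolean canOpenByName(String name) {
        try (InputStream in = new FileInputStream(name)) {
            return true;
        } catch (IOException ex) {
            // FileNotFoundException is a subclass of IOException
            return false;
        }
    }

    // Hypothetical stand-in for code that accepts any InputStream.
    static int countBytes(InputStream in) throws IOException {
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // A bare attachment name does not point at anything on disk.
        System.out.println("opened by name: "
                + canOpenByName("attachment-that-is-not-on-disk.doc"));

        // Stream-based code does not care where the bytes come from.
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});
        System.out.println("bytes read from stream: " + countBytes(in));
    }
}
```

If that assumption holds, the fix has to come from the stream side rather than from finding a path.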

Is there a way of getting a File reference I can pass to POI without copying and saving the attachment? In other words, is it as simple as getting a File (with its path) for the Lotus Notes EmbeddedObject attachment, and if so, how do I do that?


I found the answer and posted it below.

grooble
  • Hey, it's probably bad form to answer my own post, but I just had some luck with getInputStream() from the EmbeddedObject. I'll edit the original post with my nearly working solution. – grooble Jun 13 '12 at 04:42
  • POI will open things quite happily from an InputStream, what isn't working for you when you try that? – Gagravarr Jun 13 '12 at 09:33
  • @grooble, it is very good form to answer your own post. I think you are correct about getInputStream being the ideal solution: http://publib.boulder.ibm.com/infocenter/domhelp/v8r0/index.jsp?topic=%2Fcom.ibm.designer.domino.main.doc%2FH_INPUTSTREAM_PROPERTY_JAVA.html – Ken Pespisa Jun 13 '12 at 11:36
  • @grooble - you should post your answer as an answer and then accept it as the answer. You get a badge for doing that (maybe it is a silver one)! We all need badges.... – Newbs Jun 13 '12 at 14:07
  • The EmbeddedObject.getInputStream() method actually saves a temporary file on disk and then opens it for you, so your code is not really saving any of the runtime overhead. Also, be very careful because the temp file will be left on disk unless you do two things: a) close the InputStream, and b) recycle the EmbeddedObject. If you're doing this to lots of documents, you could run out of disk space. So you need to assign InputStream stream = eo.getInputStream(), then call verifyAndBuildPOIFS(stream), and then when everything is done you need to call stream.close() and eo.recycle(). – Richard Schwartz Jun 13 '12 at 14:25
  • @KenPespisa and rhsatrhs Thanks to both of you for the heads-up about close() and recycle(). I've incorporated them in my program. I guess I'm coming from a servlet background where the container usually takes care of those details for you ;) – grooble Jun 14 '12 at 01:16

1 Answer

Answering my own question...

...here's the solution I found a little while after posting the question above:

EmbeddedObject's getInputStream to the rescue...

  // Adapted from the answer linked in the question above.
  // Needs lotus.domino.*, java.io.InputStream, java.util.Enumeration, java.util.Vector,
  // and the POI classes named below (their package names vary by POI version).
  Database db = agentContext.getCurrentDatabase();
  DocumentCollection dc = db.getAllDocuments();
  Document doc = dc.getFirstDocument();
  while (doc != null) {
    System.out.println(doc.getItemValueString("Subject"));
    Item bodyItem = doc.getFirstItem("Body");
    if (bodyItem instanceof RichTextItem) {  // skip documents without a rich-text Body
      Vector v = ((RichTextItem) bodyItem).getEmbeddedObjects();
      Enumeration e = v.elements();
      while (e.hasMoreElements()) {
        EmbeddedObject eo = (EmbeddedObject) e.nextElement();
        try {
          if (eo.getType() == EmbeddedObject.EMBED_ATTACHMENT) {
            // this next line gives Apache-POI access to the InputStream
            InputStream is = eo.getInputStream();
            try {
              POIFSFileSystem poiFs = HWPFDocument.verifyAndBuildPOIFS(is);
              POIOLE2TextExtractor extractor = ExtractorFactory.createExtractor(poiFs);
              System.out.println("extracted text: " + extractor.getText());
            } finally {
              is.close();  // closing InputStream, even if POI throws
            }
          }
        } finally {
          eo.recycle();  // recycling EmbeddedObject
        }
      }
    }
    // advance to the next document before recycling the current one
    Document next = dc.getNextDocument(doc);
    doc.recycle();
    doc = next;
  }

  // thanks to rhsatrhs for the close() and recycle() tip!
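
The close() and recycle() advice from the comments matters most when parsing fails partway through, so here is a minimal JDK-only sketch of the cleanup order. FakeAttachment, StreamParser, and process are hypothetical stand-ins invented for this illustration; FakeAttachment only mimics the two EmbeddedObject calls used above and is not the real Notes class.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CleanupSketch {

    /** Hypothetical stand-in for lotus.domino.EmbeddedObject; not the real API. */
    static class FakeAttachment {
        boolean streamClosed = false;
        boolean recycled = false;
        private final byte[] data;

        FakeAttachment(byte[] data) {
            this.data = data;
        }

        InputStream getInputStream() {
            // Records the close() call so the cleanup can be observed.
            return new ByteArrayInputStream(data) {
                @Override
                public void close() throws IOException {
                    streamClosed = true;
                    super.close();
                }
            };
        }

        void recycle() {
            recycled = true;
        }
    }

    /** Hypothetical stand-in for the POI call that consumes the stream. */
    interface StreamParser {
        int parse(InputStream in) throws IOException;
    }

    // Close the stream first, then recycle the owner, whether or not parsing succeeds.
    static int process(FakeAttachment eo, StreamParser parser) throws IOException {
        InputStream is = null;
        try {
            is = eo.getInputStream();
            return parser.parse(is);
        } finally {
            if (is != null) {
                try {
                    is.close();
                } catch (IOException ignored) {
                    // still fall through to recycle()
                }
            }
            eo.recycle();
        }
    }

    public static void main(String[] args) {
        FakeAttachment bad = new FakeAttachment(new byte[] {1, 2, 3});
        try {
            process(bad, in -> { throw new IOException("not an OLE2 file"); });
        } catch (IOException ex) {
            System.out.println("parse failed: " + ex.getMessage());
        }
        System.out.println("closed=" + bad.streamClosed + ", recycled=" + bad.recycled);
    }
}
```

The point of the finally block is that one unreadable attachment does not leave its temp file behind while the loop moves on to the next document.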
grooble