3

We have a requirement to write and read the excel file to and from the MarkLogic but we are getting exception while reading excel file from MarkLogic,

We want to pass the retrieved file to the XSSFWorkbook.java given by apache.poi.

I have tried the below code to write the Excel file to the MarkLogic,

    DatabaseClient client = databaseClientService.getContentClient();
    String contains = new String(Files.readAllBytes(Paths.get("src/test/resources/TestExcelEntity.xlsx")));
    BytesHandle bytesHandle = new BytesHandle();
    bytesHandle.setMimetype("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
    bytesHandle.setFormat(Format.BINARY);
    bytesHandle.set(contains.getBytes());

    BinaryDocumentManager manager = client.newBinaryDocumentManager();
    manager.writeAs("/test/binaryDoc.xlsx", bytesHandle);

Code to read Binay Excel file

    FileHandle fileHandle = new FileHandle();
    fileHandle.setMimetype("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
    fileHandle.setFormat(Format.BINARY);

    File file = manager.read("/test/binaryDoc.xlsx", fileHandle).get();

    XSSFWorkbook workbook = new XSSFWorkbook(file)

I can see the downloaded file in a temp location, but when I open the downloaded excel file I can see the error message as "The file is corrupted and can not be open" same error message I can see when I download it from qconsole.

Since the "/test/binaryDoc.xlsx" file is not getting downloaded/read properly so XSSFWorkbook.java is failing with an exception.

org.apache.poi.openxml4j.exceptions.InvalidOperationException: Can't open the specified file input stream from file: 'C:\Users\SHIVLI~1\AppData\Local\Temp\tmp9485717536946276215.vnd.openxmlformats-officedocument.spreadsheetml.sheet'

    at org.apache.poi.openxml4j.opc.ZipPackage.openZipEntrySourceStream(ZipPackage.java:162)
    at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:149)
    at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:277)
    at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:186)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:325)
    at com.ucbos.appdata.MLSample.test(MLSample.java:55)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.springframework.test.context.junit4.statements.RunBeforeTestExecutionCallbacks.evaluate(RunBeforeTestExecutionCallbacks.java:74)
    at org.springframework.test.context.junit4.statements.RunAfterTestExecutionCallbacks.evaluate(RunAfterTestExecutionCallbacks.java:84)
    at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:75)
    at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:86)
    at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:84)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:251)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:97)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
    at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:70)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:190)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
    at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
    at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220)
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53)
Caused by: java.io.FileNotFoundException: C:\Users\SHIVLI~1\AppData\Local\Temp\tmp9485717536946276215.vnd.openxmlformats-officedocument.spreadsheetml.sheet (The system cannot find the file specified)
    at java.base/java.io.FileInputStream.open0(Native Method)
    at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
    at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
    at org.apache.poi.openxml4j.opc.ZipPackage.openZipEntrySourceStream(ZipPackage.java:159)
    ... 35 more

Update - Tried BytesHandle to read the document as byte[] and then write it to the file system but still, I am getting the same error "The file is corrupted and can not be open".

        BytesHandle readHandle = new BytesHandle();
        readHandle.setMimetype("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
        readHandle.setFormat(Format.BINARY);
        readHandle.set(BYTES_BINARY);

        byte[] file = manager.read("/test/binaryDoc.xlsx", readHandle).get();

        File outputFile = new File("outputFile.xlsx");

        OutputStream os = new FileOutputStream(outputFile);
        os.write(file);
        os.close();

Excel file is getting saved in a file system enter image description here

I am not getting what is the wrong step I am doing here,

Could anyone help me to resolve this issue?

DevNinja
  • 1,459
  • 7
  • 10
  • In the error trace above, the exception that's the root cause seems to be that the file is not found by FileInputStream. What indicates that the file is corrupted? – ehennum Feb 26 '21 at 17:30
  • Thank you @ehennum, File is there in my system, see the attached image here, but I am not able to open it – DevNinja Feb 26 '21 at 17:58
  • Maybe the Java application isn't running as a user with permission to open the file in the file system? If you aren't able to open the file, the issue may not be an encoding issue. – ehennum Feb 26 '21 at 18:46

2 Answers2

2

From the description, the issue seems to be that the document retrieval and write to the OS is not working correctly, since it shows a corrupted file. I'm not a Java developer, but it appears that you are trying to access the document as if it were a regular document, and not a binary. For binaries it appears you either need to stream the binary file or buffer it with com.marklogic.client.io.BytesHandle

In Reading Content From A Binary Document it shows several examples. The following example looks closest to what you are trying to do:

byte[] buf = docMgr.read(docID, new BytesHandle()).get();

I would also suggest eliminating passing the document to XSSFWorkbook.java until you can verify that you are saving valid files to the temp location, to simplify the troubleshooting process.

Mike Gardner
  • 6,611
  • 5
  • 24
  • 34
  • Thank you @Michael Gardner for your reply. I updated my question please take a look, As you mentioned, I tried using BytesHandle but still, I am getting an error while opening the downloaded file. – DevNinja Feb 26 '21 at 17:13
1

If you just want to read/write the xlsx file, please use below Class representing the input stream as bytes instead of reading binary file as string.

InputStreamHandle handle = new InputStreamHandle();
handle.set(docStream);

docMgr.write(uri, handle);

Please assert the validity of written data, control flow and conditions prior further manipulation.

Validation options:

  • Use Java binary package, the common facility in testing framework, to assert the input being written without loss:
> Task :fc-financial-asset:TypedWriteReadStreamTest.main()

 Document /dmsdk/FXD.xlsx write completed.

 Assert /dmsdk/FXD.xlsx Input Stream and File BYTE – 

InputStream /dmsdk/FXD.xlsx bytes: 
11614

Calculate /dmsdk/FXD.xlsx byte array: 
11614

Read /dmsdk/FXD.xlsx file bytes:
11614
  • Rename the tmp*****.spreadsheetml.sheet to tmp*****.spreadsheetml.xlsx, you should be able to open the valid excel.
  • save or validate the document from QConsole.
Fiona Chen
  • 1,358
  • 5
  • 16