7

My Java application requires access to a large Excel file (1GB+ in size) saved on a remote shared folder. I'm using SmbFile to get the file with authentication.

Note: Downloading of the file is not an option mainly for size reasons.

The problem is that I need the Excel file to be a Java IO File and not an SmbFile, since the other libraries that I'm using to parse the Excel file only accept a Java IO File.

  1. Is there any way to convert this SmbFile into a Java compatible File?
Adam Michalik
arash moeen
  • Does the other library accept anything else other than a `File`? Eg. an `InputStream`? If yes, you can use `SmbFile.getInputStream()`. If no, you can download the file locally (eg. to a temp file) and use that for the other library. Would that work for you? – Adam Michalik Apr 04 '16 at 09:45
  • The library works with InputStream but the problem is it also closes the stream after each excel sheet is processed and then I have to open it again for the next sheet which I think it should be ok. Downloading the file is not an option due to the size of the excel (1GB+). I will try the InputStream and will let you know. – arash moeen Apr 04 '16 at 09:55
  • Can you include what library you use for excel parsing? Maybe this brings up more people to help. – mad_manny Apr 04 '16 at 11:16
  • And if it is just the `close()` call that is a problem, then you might want to wrap the InputStream with a dummy implementation that intercepts the close and ignores it. Beware! This is quite a hack and only suggested as a "last resort" if there is no other solution and re-opening is too expensive or painful. – rpy Apr 06 '16 at 20:34
  • @mad_manny I'm using a wrapper over apache poi (https://github.com/monitorjbl/excel-streaming-reader). It accepts both file and inputstream. I think I should be using inputstream instead of file. – arash moeen Apr 07 '16 at 07:25
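
The close-intercepting wrapper mentioned in the comments could look roughly like the sketch below. It uses only java.io; the class name NonClosingInputStream and the method reallyClose() are made up for illustration and are not part of jcifs or of the Excel library. Keep in mind that ignoring close() only keeps the stream open. It does not rewind it, so bytes that were already consumed are gone.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical helper: wraps a stream and ignores close(), so a consuming
 * library cannot close the underlying stream early.
 */
class NonClosingInputStream extends FilterInputStream {

    NonClosingInputStream(InputStream delegate) {
        super(delegate);
    }

    /** Intentionally a no-op: the wrapped stream stays open. */
    @Override
    public void close() {
        // Swallow the close request coming from the library.
    }

    /** Closes the wrapped stream; call this yourself when all work is done. */
    void reallyClose() throws IOException {
        in.close();
    }
}
```

You would pass `new NonClosingInputStream(smbFile.getInputStream())` to the library and call `reallyClose()` once the last sheet has been processed.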

5 Answers

8

See implementation details of your library:

This library will take a provided InputStream and output it to the file system. (...) Once the file is created, it is then streamed into memory from the file system.

The reason for needing the stream being outputted in this manner has to do with how ZIP files work. Because the XLSX file format is basically a ZIP file, it's not possible to find all of the entries without reading the entire InputStream.

(...) This library works by reading out the stream into a temporary file. As part of the auto-close action, the temporary file is deleted.

If you need more control over how the file is created/disposed of, there is an option to initialize the library with a java.io.File. This file will not be written to or removed

So it doesn't matter if you use the File or InputStream API - the whole file will need to be downloaded anyhow.

The simplest solution is to pass the SmbFile.getInputStream() to

StreamingReader.builder().read(smbFile.getInputStream())

but alternatively you can first download the file eg. by means of IOUtils.copy() or Files.copy()

File file = new File("...");
// Both streams are closed automatically, even if the copy fails.
try (
     InputStream in = smbFile.getInputStream();
     OutputStream out = new FileOutputStream(file)
) {
    IOUtils.copy(in, out);
}

or

// Open the stream once and copy from that same stream.
try (InputStream in = smbFile.getInputStream()) {
    Files.copy(in, file.toPath());
}

and pass file to

StreamingReader.builder().read(file)
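
If you go the download route, a temporary file keeps the cleanup in your own hands. The following is only a sketch with the standard library: the class and method names (TempCopy, copyToTempFile) and the "workbook-" prefix are invented for this example. One detail worth knowing is that Files.copy(InputStream, Path) fails with FileAlreadyExistsException when the target already exists, and createTempFile does create the file, hence the REPLACE_EXISTING option.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for copying a remote stream into a local temp file. */
class TempCopy {

    /**
     * Copies the whole stream into a fresh temp file and closes the stream.
     * The caller is responsible for deleting the returned file afterwards.
     */
    static Path copyToTempFile(InputStream source) throws IOException {
        Path temp = Files.createTempFile("workbook-", ".xlsx");
        try (InputStream in = source) {
            // createTempFile already created the file, so overwrite it.
            Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Do not leave a partial copy behind.
            Files.deleteIfExists(temp);
            throw e;
        }
        return temp;
    }
}
```

Usage would be along the lines of `Path local = TempCopy.copyToTempFile(smbFile.getInputStream());`, then `local.toFile()` for the reader, and `Files.deleteIfExists(local)` when you are done.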
Adam Michalik
  • Thank you so much for actually spending time checking that library as well. Let me explain the main issue here. I cannot copy the file to my local server because the file size increases the processing time (tons of validation checks have to happen on 1kk+ rows with 338 columns), therefore downloading the file is off the table. – arash moeen Apr 08 '16 at 03:50
  • The reason I'd rather not use an InputStream and pass it to that library is that it doesn't provide any way to know how many sheets there are prior to reading. It only tells you whether a sheet is available if you provide its name, and I need the number of sheets. So my workaround was to unzip the file, check the worksheets folder and count, which can be done easily using Apache POI. But using an InputStream will throw an exception due to the file size, so that can be done only if the file can be read as a Java File. – arash moeen Apr 08 '16 at 03:53
  • I'm trying to host the file completely on my server folder in order to have local access to it, that would simplify everything for it. But again thanks so much for your help – arash moeen Apr 08 '16 at 03:54
  • @arashmoeen - interesting use case. I can't think of any better option than doing that kind of processing on the server where the file actually resides - just as you said. Good luck! – Adam Michalik Apr 11 '16 at 07:09
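
For the sheet-count workaround discussed in these comments, here is a rough sketch that counts worksheet entries straight from a stream with java.util.zip, without a local File. It assumes the conventional XLSX layout where each worksheet is stored as xl/worksheets/<name>.xml; the authoritative sheet list lives in xl/workbook.xml, so treat this as an approximation. SheetCounter and countWorksheets are made-up names. Note that the whole stream still has to travel over the network once, it just isn't stored locally.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: counts worksheet parts in an XLSX (ZIP) stream. */
class SheetCounter {

    private static final String SHEET_DIR = "xl/worksheets/";

    /** Reads the ZIP entries sequentially and closes the stream when done. */
    static int countWorksheets(InputStream xlsx) throws IOException {
        int count = 0;
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Count only XML files directly inside xl/worksheets/,
                // which skips sub-folders such as _rels.
                boolean directChild = name.indexOf('/', SHEET_DIR.length()) < 0;
                if (name.startsWith(SHEET_DIR) && name.endsWith(".xml") && directChild) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

With jcifs this would be called as `SheetCounter.countWorksheets(smbFile.getInputStream())`.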
2

Using Apache Commons IO library

https://mvnrepository.com/artifact/commons-io/commons-io

NtlmPasswordAuthentication auth = new NtlmPasswordAuthentication("", "user", "key");

SmbFile smbFile = new SmbFile("smb://IP/pitoka.tmp", auth);

InputStream initialStream = smbFile.getInputStream();

File targetFile = new File("/tmp/pitoka.tmp");

FileUtils.copyInputStreamToFile(initialStream, targetFile);

I hope this helps you.

1
    jcifs.smb.SmbFile smbFile = new SmbFile("smb://host/fileShare/.../file");
    java.io.File javaFile = new File(smbFile.getUncPath());

    System.out.println(smbFile);
    System.out.println(javaFile);

Output

smb://host/fileShare/.../file
\\host\fileShare\...\file

javadoc of smbFile.getUncPath() says

Returns the Windows UNC style path with backslashes instead of forward slashes.

I am using jcifs-1.3.17.jar on Windows 10.
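
To illustrate what that conversion amounts to, here is a small sketch that does the same kind of mapping with java.net.URI only, so it runs without jcifs on the classpath. SmbUrlToUnc and toUncPath are made-up names, and the sketch assumes a well-formed smb:// URL without spaces or other unencoded characters. Note that a java.io.File built from a UNC path is opened with the credentials of the running Windows session, not with any jcifs authentication object.

```java
import java.net.URI;

/** Hypothetical helper: turns an smb:// URL into a Windows UNC path string. */
class SmbUrlToUnc {

    static String toUncPath(String smbUrl) {
        URI uri = URI.create(smbUrl);
        if (!"smb".equalsIgnoreCase(uri.getScheme()) || uri.getHost() == null) {
            throw new IllegalArgumentException("Not a usable smb:// URL: " + smbUrl);
        }
        // \\host followed by the path with forward slashes turned into backslashes
        return "\\\\" + uri.getHost() + uri.getPath().replace('/', '\\');
    }

    public static void main(String[] args) {
        System.out.println(toUncPath("smb://host/fileShare/dir/file.xlsx"));
    }
}
```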

rsinha
0

Recently I had a similar situation. I hadn't found a good solution on the internet, so I wrote some basic code that did what I needed.

In your case, you will need to copy the Excel file from the source (remote directory) to the destination (local directory) using SmbFile with authentication. Only then take the destination path of the Excel file (from getCanonicalPath()) and convert it from SmbFile format to File format with the code below. After that, create your File object from the converted destination path and do what you want.

I use JCIFS to work with remote shared directories using the SmbFile class.

First, you need to import the main libraries:

import java.io.File;
import java.io.IOException;
import jcifs.smb.SmbFile;

Second, you need to create a static method to convert from SmbFile format to File format:

/**
 * This method converts a directory path from SmbFile format to File format.<br />
 * <p><strong>Syntax:</strong> <br />&nbsp;&nbsp;&nbsp;&nbsp;convertSmbFileToFile("Canonical Path")</p>
 * <p><strong>Example:</strong> <br />&nbsp;&nbsp;&nbsp;&nbsp;convertSmbFileToFile("smb://localhost/D$/DOCUMENTOS/workspace/tests2/access")</p>
 * @param smbFileCanonicalPath String
 * @see String
*/
public static String convertSmbFileToFile(String smbFileCanonicalPath) {
    String[] tempVar = smbFileCanonicalPath.substring(6).replace("$", ":").split("/"); 
    String bar = "\\";
    String finalDirectory = "";
    for (int i = 1; i < tempVar.length; i++) {
        finalDirectory += tempVar[i] + bar;
        if (i == tempVar.length - 1) {
            finalDirectory = finalDirectory.substring(0,finalDirectory.length()-1);
        }
    }
    return finalDirectory;
}

Optionally, you could also create a static method to convert from File format to SmbFile format:

/**
 * This method converts a directory path from File format to SmbFile format.<br />
 * <p><strong>Syntax:</strong> <br />&nbsp;&nbsp;&nbsp;&nbsp;convertFileToSmbFile("Canonical Path")</p>
 * <p><strong>Example:</strong> <br />&nbsp;&nbsp;&nbsp;&nbsp;convertFileToSmbFile("D:\DOCUMENTOS\workspace\tests2\access")</p>
 * @param fileCanonicalPath String
 * @see String
*/
public static String convertFileToSmbFile(String fileCanonicalPath) {
    return "smb://localhost/" + fileCanonicalPath.toString().replace(":", "$").replace("\\", "/");
}

Finally, you can call the methods as in the example below:

String dirDest = "access/";

    try {

        File localDirFile = new File(dirDest);

        SmbFile localSmbDirFile = new SmbFile(convertFileToSmbFile(localDirFile.getCanonicalPath()));
        File localDirFile2 = new File(convertSmbFileToFile(localSmbDirFile.getCanonicalPath()));

        System.out.println("Original File Format: " + localDirFile.getCanonicalPath());
        System.out.println("Original File Format to SmbFile Format: " + localSmbDirFile.getCanonicalPath());
        System.out.println("Converted SmbFile Format to File Format: " + localDirFile2.getCanonicalPath());

    } catch (IOException e) {

        System.err.println("[ERR] IO Exception - " + e);

    }

Result of running the previous code:

Original File Format: D:\DOCUMENTOS\workspace\tests2\access
Original File Format to SmbFile Format: smb://localhost/D$/DOCUMENTOS/workspace/tests2/access
Converted SmbFile Format to File Format: D:\DOCUMENTOS\workspace\tests2\access

Extra Information: getCanonicalPath()

Maybe this code will help you; I am available to discuss it if you want.

Good Luck!

0

It's just a matter of structure, I guess: the SmbFile constructor takes two arguments, while the File constructor takes just one. So my idea is to declare a File with the same path as the SmbFile and work with that. For example, in my case I want to recursively delete the contents of my folder:

SmbFile sFile = new SmbFile(path, auth);

if (sFile.exists()) {
 File file = new File(path);
 deleteDirectory(file);

}

 boolean deleteDirectory(File directoryToBeDeleted) {
    File[] allContents = directoryToBeDeleted.listFiles();
    if (allContents != null) {
        for (File file : allContents) {
            deleteDirectory(file);
        }
    }
    return directoryToBeDeleted.delete();
}

I hope this piece of code helps you, and sorry for my English!

LAGHRAOUI