
I have two huge files that I need to merge into another file. Currently I am using Apache Commons VFS to connect to an SFTP server and merge them. I am using the logic below to merge the files:

for (String file : filesToMerge) {
    try (FileObject fileObject = utility.getFileObject();
         OutputStream fileOutputStream = fileObject.resolveFile(pathToExport + "file3.txt")
                 .getContent().getOutputStream(true);
         OutputStreamWriter outputStreamWriter = new OutputStreamWriter(fileOutputStream, "ISO-8859-1");
         BufferedWriter bufferedWriter = new BufferedWriter(outputStreamWriter)) {

        String content = fileObject.resolveFile(pathToExport + file).getContent().getString("ISO-8859-1");
        bufferedWriter.write(content);
        log.info("process completed");

    } catch (Exception e) {
        log.error("Error while merging files. The error is: " + e);
    } finally {
        log.info("closing FTP session");
    }
}

The files are very large and I have limited memory. Is there a more efficient way to merge the files, instead of reading the entire content as a String? Would using a third-party library such as apache-commons-io instead of BufferedWriter improve performance?

1 Answer


Yes, the proper way to do it is written at the top of the FileContent documentation page:

To read from a file, use the InputStream returned by getInputStream().

So replace what you have with:

fileObject.resolveFile(pathToExport+file).getContent()
        .getInputStream()
        .transferTo(fileOutputStream);

Be aware that your code is not currently merging the files, but overwriting file3.txt with each new file. You should probably append to the output stream instead of overwriting.
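
For example, the whole merge could look roughly like this (a minimal sketch, assuming the same utility.getFileObject(), pathToExport, filesToMerge and log from the question; getOutputStream(true) asks Commons VFS to open the target for appending):

try (FileObject fileObject = utility.getFileObject();
     OutputStream out = fileObject.resolveFile(pathToExport + "file3.txt")
             .getContent().getOutputStream(true)) {   // true = append to the target
    for (String file : filesToMerge) {
        try (InputStream in = fileObject.resolveFile(pathToExport + file)
                .getContent().getInputStream()) {
            in.transferTo(out);  // copies in chunks, never loads a whole file into memory
        }
    }
} catch (Exception e) {
    log.error("Error while merging files", e);
}

Opening the output once, outside the loop, also avoids re-resolving file3.txt for every input file.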

k314159
  • Thanks for the response. I had a look at this link: https://stackoverflow.com/questions/41205525/does-a-java-inputstream-help-or-hurt-memory-usage-with-large-files. So does an InputStream take up as much memory as the size of the file? I have been reading articles and I am unable to get clarity on this. – srinivas chaitanya Oct 10 '22 at 17:44
  • The idea is that you should use the InputStream to transfer only a small amount of content at a time, instead of reading all content into memory at once. – k314159 Oct 10 '22 at 17:53
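
To illustrate that last point: transferTo essentially performs the following chunked copy for you, so only one small buffer is ever held in memory regardless of the file size (the buffer size below is arbitrary):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Roughly what in.transferTo(out) does: copy one small chunk at a time until EOF.
static void copy(InputStream in, OutputStream out) throws IOException {
    byte[] buffer = new byte[8192]; // fixed-size buffer; memory use does not grow with file size
    int bytesRead;
    while ((bytesRead = in.read(buffer)) != -1) {
        out.write(buffer, 0, bytesRead);
    }
}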