
I am trying to convert a file that has been downloaded into a `byte[]` and return it. The maximum size I can convert without it failing is around 70 MB, which is the amount of free memory on my Ubuntu instance. It would be unrealistic to need the file's size worth of RAM just to be able to download it.

I have tried reading through a BufferedInputStream into a ByteArrayOutputStream, but it runs out of memory the moment it begins writing. In the code below it makes it to "Starting buffered write" before it stops.

            FileInputStream fis = null;
            BufferedInputStream bis = null;
            byte[] bytes = null;
            byte[] buffer = new byte[1024];
            int count = 0;
            ByteArrayOutputStream bos = new ByteArrayOutputStream();

            try
            {
                fis = new FileInputStream(file);
                bis = new BufferedInputStream(fis);

                // Copy the file into the in-memory stream 1 KB at a time
                System.out.println("Starting buffered write");
                while ((count = bis.read(buffer)) != -1)
                {
                    bos.write(buffer, 0, count);
                }
                System.out.println("Finished buffered write");

                //System.out.println("Trying copyLarge");
                //IOUtils.copyLarge(fis, bos);
                //System.out.println("Successful copyLarge");

                // Materialise the accumulated stream as a byte[]
                System.out.println("Starting stream to bytes");
                bytes = bos.toByteArray();
                System.out.println("Finished stream to bytes");

                fis.close();
                bis.close();
                bos.flush();
                bos.close();
            }
            catch (IOException e)
            {
                e.printStackTrace();
            }

Something that confuses me is that this approach works for the upload function with large files without any problem. The upload creates a temp file and an output stream from it, then writes the uploaded file's input stream into it. Is it possible that, since the upload temp files are not being deleted after use, they are using up my instance's memory?
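For reference, the upload path is roughly the sketch below (simplified; the method name, temp-file prefix, and 1 KB buffer are just placeholders for what my code does):

    import java.io.*;

    // Upload path: copy the request's InputStream to a temp file in small
    // chunks, so the whole file never has to fit in memory at once.
    static File saveToTempFile(InputStream in) throws IOException
    {
        File tempFile = File.createTempFile("upload-", ".tmp");
        try (OutputStream out = new FileOutputStream(tempFile))
        {
            byte[] buffer = new byte[1024];
            int count;
            while ((count = in.read(buffer)) != -1)
            {
                out.write(buffer, 0, count);
            }
        }
        return tempFile;
    }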

dassoop
    You can write a large file from network to disk or disk to network, but if you intend to hold the whole file in a `byte[]` then you need the memory for it. Don't make a method that returns a file as a `byte[]`, that's bad design (and if you really need it, there's `Files.readAllBytes(Paths.get("/path/to/file"))`). – Kayaman Feb 17 '21 at 23:43
  • That makes sense, thank you. Is it possible to return a file as a ResponseEntity without turning it into a byte array? I am currently writing it to a `byte[]`, creating a ByteArrayResource, and returning that as the body of a ResponseEntity. – dassoop Feb 18 '21 at 00:05
  • [Streaming](https://stackoverflow.com/questions/51845228/proper-way-of-streaming-using-responseentity-and-making-sure-the-inputstream-get) is always the key, so you don't keep things in memory when you don't need them. – Kayaman Feb 18 '21 at 08:51
  • Thank you! Returning them as a ResourceStream worked perfectly. – dassoop Feb 21 '21 at 02:29
  • Strangely enough, my POC app (which deals with 250MB of file data with the heap set to `-Xmx50m`) would throw OOME when I used `IOUtils.toByteArray` and worked fine when I used `Files.readAllBytes`, with no other variables changing. Heap dump analysis also suggested that `toByteArray` may be holding references to `byte[]` unnecessarily, although I couldn't validate it. Thanks for the tip @Kayaman – Cyriac George Mar 02 '22 at 00:23

1 Answer


As Kayaman told me, loading the file into a byte array requires enough memory to hold the entire file. Returning my file for download as an InputStreamResource solved my problem without the byte array.
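For anyone else hitting this, the download endpoint now looks roughly like this (a minimal sketch using Spring's InputStreamResource; the controller name, mapping, and file path are placeholders):

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;

    import org.springframework.core.io.InputStreamResource;
    import org.springframework.http.HttpHeaders;
    import org.springframework.http.MediaType;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class DownloadController
    {
        @GetMapping("/download")
        public ResponseEntity<InputStreamResource> download() throws FileNotFoundException
        {
            File file = new File("/path/to/file"); // placeholder path

            // Wrap the file's InputStream instead of reading it into a byte[],
            // so the content is streamed to the client rather than held in memory.
            InputStreamResource body = new InputStreamResource(new FileInputStream(file));

            return ResponseEntity.ok()
                    .header(HttpHeaders.CONTENT_DISPOSITION,
                            "attachment; filename=\"" + file.getName() + "\"")
                    .contentLength(file.length())
                    .contentType(MediaType.APPLICATION_OCTET_STREAM)
                    .body(body);
        }
    }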

dassoop