I'm using the code below inside a servlet to read a PDF and write it to the response, but for some PDFs the read() method blocks after reading some bytes:

public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException{
    InputStream is = null;
    OutputStream oos = null;
    try {
         String pdfPath = (String) request.getSession().getAttribute("viewPdfPath");
         
         File file=new File(pdfPath);
         
         oos = response.getOutputStream();
         response.setContentType("application/pdf");
         byte[] buf = new byte[8192];
    
         is= new FileInputStream(file);
         int c = 0;
         while ((c = is.read(buf, 0, buf.length)) > 0) { // blocks here after reading some bytes
             oos.write(buf, 0, c);
             oos.flush();
         }
    
         oos.flush();
     } catch (FileNotFoundException e) {
            e.printStackTrace();
    }catch(Exception e){
        e.printStackTrace();
    } finally {
        if(oos != null)
            oos.close();
        if(is != null)
            is.close();
    }
}

When the same code is run from a terminal as a standalone Java class, on the same Linux server that hosts the application, it reads all bytes of the same PDF successfully.

Why does the InputStream read() method block inside the application, when the same code run as a standalone Java class on the same Linux server reads the file without blocking?

Mark Rotteveel
  • Not related to your question, but I highly recommend that you learn how to use [try-with-resources](https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html) to simplify your code and prevent resource leaks that are possible in your current code. – Mark Rotteveel Jan 08 '22 at 08:25
  • 1 significant difference: "standalone java" app doesn't sound like "multi-threading"!? what happens when you (by accident) "hit the button twice"!? (in the same "session" ..without changing `viewPdfPath`) ... there will be two threads trying to read from the same file! – xerx593 Jan 08 '22 at 08:29
  • Every time method `doGet` is invoked, it runs in a separate thread. Try creating multiple threads in your standalone Java program where each thread tries to read the same PDF file and see if that blocks. – Abra Jan 08 '22 at 08:33
  • @xerx593 Why would [concurrent reads of the same file](https://stackoverflow.com/questions/9543984/is-it-safe-to-access-the-same-file-by-several-fileinputstream) cause indefinite blocking? – polo-language Jan 08 '22 at 08:48
  • Use `Files.copy(Path.of(pdfPath), oos)` instead of all that verbose code. How have you confirmed that `pdfPath` read is blocked? Get a thread dump of the app server (`kill -QUIT pid`) and also see open file handles of same server (`lsof |grep pid`). – DuncG Jan 08 '22 at 10:16
  • By the way the loop condition ought to be `>= 0` – DuncG Jan 08 '22 at 10:45
  • @DuncG We also tried with Files.copy(Path.of(pdfPath), oos) but still the same issue was there. In order to check that the read is blocked, we printed the count inside the while loop to check the number of times it got called for that PDF and found that the result (max count * 8192) was less than the PDF size. Also, we have put the logger in finally block and found that while reading it was not getting called for those PDF. – user17869627 Jan 10 '22 at 06:29
  • The block may be the writer to servlet output. How big is this PDF? Are there servlet filters in the chain? You need to see stack trace of the VM to confirm FIS read is locked, as my comment above (try also jstack). – DuncG Jan 10 '22 at 08:39
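
The try-with-resources and `Files.copy` suggestions from the comments can be combined roughly as follows. This is a minimal sketch, not the asker's servlet: `PdfStreamSketch` and `streamFile` are hypothetical names, and a `ByteArrayOutputStream` stands in for the servlet response stream so the example runs without a servlet container. The asker reports that switching to `Files.copy` did not resolve the hang, so this only simplifies the code and closes resources reliably.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class PdfStreamSketch {

    /**
     * Copies the whole file to the given stream and returns the number of
     * bytes copied. The caller owns the stream, so it is flushed here but
     * not closed.
     */
    public static long streamFile(Path pdfPath, OutputStream out) throws IOException {
        long copied = Files.copy(pdfPath, out); // reads until end of file
        out.flush();
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the PDF: a temp file with 20,000 zero bytes.
        Path tmp = Files.createTempFile("sketch", ".pdf");
        try {
            Files.write(tmp, new byte[20_000]);
            // try-with-resources closes the stream even if the copy throws.
            try (OutputStream out = new ByteArrayOutputStream()) {
                long copied = streamFile(tmp, out);
                System.out.println("copied " + copied + " of " + Files.size(tmp) + " bytes");
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

In a servlet, the stream passed in would be `response.getOutputStream()`. Comparing the returned count with the file size gives the same check the asker did by counting loop iterations.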

1 Answer

According to the Javadoc, FileInputStream::read is a blocking operation:

Reads up to len bytes of data from this input stream into an array of bytes. If len is not zero, the method blocks until some input is available

When the file is on the same machine, the read usually completes so quickly that you never notice it, but the call can still block for some milliseconds while waiting for the OS, disk, etc. When reading from a remote machine, the blocking time is more likely to be long enough for you to notice.

polo-language
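
To tell whether a request thread is stuck in the file read or in the write to the response, the comments suggest taking a thread dump (`kill -QUIT pid` or `jstack`). The sketch below shows an in-process way to get similar information using `Thread.getAllStackTraces()`. It is an illustration under assumptions: `StackSketch`, `dumpThreads`, and the thread name `demo-stuck-thread` are hypothetical, and the "stuck" thread here simply waits on an empty queue rather than on real file or socket I/O.

```java
import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;

public class StackSketch {

    /** Returns name, state and stack of every live thread whose name contains nameFragment. */
    public static String dumpThreads(String nameFragment) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().contains(nameFragment)) {
                continue;
            }
            sb.append(t.getName()).append(" [").append(t.getState()).append("]\n");
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a hung request thread: take() on an empty queue never returns.
        Thread stuck = new Thread(() -> {
            try {
                new LinkedBlockingQueue<String>().take();
            } catch (InterruptedException ignored) {
                // interrupted by main below; just exit
            }
        }, "demo-stuck-thread");
        stuck.setDaemon(true);
        stuck.start();

        Thread.sleep(200); // give the thread time to reach the blocking call
        System.out.print(dumpThreads("demo-stuck"));
        stuck.interrupt();
    }
}
```

In the servlet case, the top frames of the request thread would show whether it is waiting inside the `FileInputStream` read or inside the write to the servlet output stream.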