I'm using the following code to read large files:

InputStreamReader isr = new InputStreamReader(new FileInputStream(FilePath));
BufferedReader br = new BufferedReader(isr);
String cur;
while ((cur = br.readLine()) != null) {
    // process the current line
}

I'm able to read large files with the code above, but I want to know how these readers work internally in memory. What role does InputStreamReader play? How many chunks of memory get allocated while reading a file (e.g. 2 GB) line by line?

Mayank
  • Look at the source. – Lino Jun 22 '18 at 07:55
  • Source as in the Javadocs? I already went through those. My confusion is about how the InputStreamReader and the BufferedReader are linked. – Mayank Jun 22 '18 at 07:59
  • A BufferedReader needs a source to read from. That is the only *link* there is. – Lino Jun 22 '18 at 08:03
  • So does that mean isr will keep supplying characters to br, and the BufferedReader will store them in its buffer? If that's the case, how many characters does isr read in one go from the file (it reads bytes and decodes them into characters)? – Mayank Jun 22 '18 at 08:12

1 Answer

InputStreamReader is a facility to convert a raw InputStream (a stream of bytes) into a stream of characters, according to some charset. FileInputStream is a stream of bytes (it extends InputStream) read from a given file. You can also use InputStreamReader to read text from a socket, for instance, since socket.getInputStream() returns an InputStream as well.

InputStreamReader is a Reader, the abstract class for a stream of characters. Using an InputStreamReader alone would be inefficient: it has no readLine method, and each read() call may cause bytes to be read from the file and decoded. When you decorate it with a BufferedReader, the BufferedReader reads a chunk of characters, keeps it in memory, and serves subsequent reads (including readLine) from that buffer.
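The layering can be sketched roughly as follows. This is only an illustration, assuming UTF-8 text: a ByteArrayInputStream stands in for the FileInputStream so that it runs without a file, and the names ReaderChainSketch and readLines are made up for the example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReaderChainSketch {

    // Hypothetical helper: stacks the decoding and buffering layers on any byte source.
    static List<String> readLines(InputStream bytes) throws IOException {
        // Bytes -> chars, using an explicit charset instead of the platform default.
        Reader chars = new InputStreamReader(bytes, StandardCharsets.UTF_8);
        // Chars -> lines: BufferedReader buffers decoded chars and adds readLine().
        try (BufferedReader lines = new BufferedReader(chars)) {
            // Collecting into a list is fine for a tiny demo; for a large file,
            // process each line inside the loop instead of keeping it.
            List<String> result = new ArrayList<>();
            String cur;
            while ((cur = lines.readLine()) != null) {
                result.add(cur);
            }
            return result;
        }
    }

    public static void main(String[] args) throws IOException {
        // "\u00e9" (e with acute accent) takes 2 bytes in UTF-8 but is a single char.
        byte[] data = "h\u00e9llo\nworld\n".getBytes(StandardCharsets.UTF_8);
        List<String> lines = readLines(new ByteArrayInputStream(data));
        System.out.println(data.length + " bytes, first line has "
                + lines.get(0).length() + " chars");
    }
}
```

Swapping the ByteArrayInputStream for a FileInputStream or a socket's input stream leaves readLines unchanged, which is the point of the decorator arrangement.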

About the size: the documentation does not state the default value:

https://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html

The buffer size may be specified, or the default size may be used. The default is large enough for most purposes.

You must check the source file to find the value.

https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/io/BufferedReader.java

This is the default in the OpenJDK 7 implementation, measured in chars rather than bytes:

 private static int defaultCharBufferSize = 8192;

Oracle's closed-source JDK implementation may use a different value.
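Since the default is an implementation detail, code that cares about the size can pass it explicitly through the two-argument BufferedReader constructor. A minimal sketch, where BufferSizeSketch, countLines and the 64K figure are arbitrary choices for illustration, not recommendations:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class BufferSizeSketch {

    // Hypothetical size picked for the example; it counts chars, not bytes.
    static final int BUFFER_CHARS = 64 * 1024;

    // Counts lines while holding at most one buffer of chars plus the current line.
    static long countLines(Reader source, int bufferChars) throws IOException {
        // The constructor throws IllegalArgumentException if bufferChars <= 0.
        try (BufferedReader br = new BufferedReader(source, bufferChars)) {
            long count = 0;
            while (br.readLine() != null) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the InputStreamReader over a file.
        System.out.println(countLines(new StringReader("a\nb\nc"), BUFFER_CHARS));
    }
}
```

The buffer size affects how often the underlying reader is asked for more data, not the result: the same input yields the same line count whatever size is passed.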

  • you should probably mention that this is from JDK7 – Lino Jun 22 '18 at 08:08
  • How does FileInputStream get the stream of bytes from the input file? – Mayank Jun 22 '18 at 08:21
  • https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/io/FileInputStream.java Take a look at the code. It needs to communicate with the JVM for inter-operating with the host operating system. It's not easy to guess how it is done. – Danilo M. Oliveira Jun 22 '18 at 08:29
  • And there's also native (C or C++) code: private native void open(String name) throws FileNotFoundException; – Danilo M. Oliveira Jun 22 '18 at 08:31
  • You would be better off first understanding the Reader/InputStream abstractions and how the Java API uses decorators to enrich their basic functionality. – Danilo M. Oliveira Jun 22 '18 at 08:32
  • If FileInputStream keeps generating a stream of bytes in RAM for InputStreamReader to decode, how come we don't run out of memory while reading a large file? – Mayank Jun 22 '18 at 08:37
  • Because you read the file one piece at a time, keeping the pieces in Strings or a byte[] array, and those pieces are garbage collected once they are no longer needed. – Danilo M. Oliveira Jun 22 '18 at 08:38
  • That's what I wanted to know. Thanks a lot! – Mayank Jun 22 '18 at 08:40
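
The point made in the last few comments can be illustrated with a rough sketch. RepeatingLineStream is a made-up stand-in for a large file: it generates its bytes on demand, so nothing resembling the whole "file" ever exists in memory, and the loop keeps only running totals plus the current line. All names here are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class BoundedMemorySketch {

    // Hypothetical byte source: emits the same ASCII line totalLines times, on demand.
    static final class RepeatingLineStream extends InputStream {
        private final byte[] line;
        private final long totalLines;
        private long produced = 0;
        private int pos = 0;

        RepeatingLineStream(String text, long totalLines) {
            this.line = (text + "\n").getBytes(StandardCharsets.US_ASCII);
            this.totalLines = totalLines;
        }

        @Override
        public int read() {
            if (produced == totalLines) {
                return -1; // end of stream
            }
            int b = line[pos++] & 0xFF;
            if (pos == line.length) {
                pos = 0;
                produced++;
            }
            return b;
        }
    }

    // Returns {line count, total chars excluding line terminators}.
    static long[] scan(InputStream in) throws IOException {
        long lines = 0;
        long chars = 0;
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.US_ASCII))) {
            String cur;
            while ((cur = br.readLine()) != null) {
                // cur becomes garbage as soon as the next iteration replaces it.
                lines++;
                chars += cur.length();
            }
        }
        return new long[] {lines, chars};
    }

    public static void main(String[] args) throws IOException {
        long[] totals = scan(new RepeatingLineStream("0123456789", 1_000_000));
        System.out.println(totals[0] + " lines, " + totals[1] + " chars");
    }
}
```

Raising the line count makes the run longer but should not change how much memory the loop itself holds, because each line String becomes unreachable once the next one is read.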