
We have operations that write to a file on a server using a BufferedWriter. Once all the writing is done, I want to upload the contents to an S3 bucket. Currently (and inappropriately) I collect every string that the BufferedWriter writes in a StringBuilder, appending each time, but the result is a huge string (~150 MB). It would be preferable to upload what the BufferedWriter has written directly. I have been scouring the internet and SO, but I cannot find a definitive answer to this question.

Is this possible and with very little code?

  • Why don’t you just write to both the file *and* S3 at the same time – Bohemian Nov 26 '18 at 22:38
  • You can wrap the `BufferedWriter` around a `StringWriter`. – Andreas Nov 26 '18 at 22:40
  • Unfortunately, the file is being written sequentially to a server location and since S3 does not allow you to append to objects, I must wait until the file is completely written and then upload to S3. –  Nov 26 '18 at 22:40
  • Use a FilterWriter. Implement the 'write' methods to do what you want to do and then call the corresponding 'out.write' method. – Tom Drake Nov 26 '18 at 22:41
  • Why do you want to upload the content before BufferedWriter finishes its job (that is, before closing the writer)? –  Nov 26 '18 at 22:48
  • I don't. I have to wait until all the writing is complete and then upload the contents of the buffer to an S3 bucket... –  Nov 26 '18 at 22:51
  • Then why do you keep it as a String instead of directly uploading the resulting file that physically exist somewhere on the server? –  Nov 26 '18 at 22:54
  • The file sits on EMR, so it would require a different approach to get the resultant file(s) to S3. I was hoping for a purely Java solution to integrate into the current code base (append to the file on EMR and then upload the contents to S3 when done), but it looks as though that will not be possible. There are options such as s3DistCp that we are looking at as well. –  Nov 26 '18 at 22:58

3 Answers


A BufferedWriter is little more than a wrapper around some other Writer. So it will depend on what type of Writer was passed into its constructor.

Types of Writer that support the ability to read back include the CharArrayWriter and the StringWriter, which allow you to read the contents as a char[] and a String, respectively.
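As a minimal sketch of that idea, the example below wraps a BufferedWriter around a StringWriter and reads the text back after the writer is closed. The class name `CaptureToString` and the method `writeLines` are made up for illustration. Note that this still holds the whole content in memory, so it does not by itself solve the size problem from the question.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;

// Hypothetical helper, not part of any library.
final class CaptureToString {

    /** Writes each line through a BufferedWriter and returns everything that was written. */
    static String writeLines(String... lines) throws IOException {
        StringWriter sink = new StringWriter();
        try (BufferedWriter writer = new BufferedWriter(sink)) {
            for (String line : lines) {
                writer.write(line);
                writer.write('\n');
            }
        } // closing the BufferedWriter flushes its buffer into the StringWriter
        return sink.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.print(writeLines("alpha", "beta"));
    }
}
```

Swapping the StringWriter for a CharArrayWriter works the same way, with `toCharArray()` in place of `toString()`.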

Joe C

Is it possible to get the contents of a BufferedWriter as a String?

No, it is not possible to do that.

The BufferedWriter only holds "one buffer full" of the data that has been written. That isn't sufficient for what you need. The rest of the data will have been written to the file and will no longer be available in memory. (And besides, the writer's buffer is deliberately hidden behind an abstraction layer so that you can't get at it ... without doing "nasty" reflection.)

Now, you could add an extra component or components to the output stack to capture the output in memory. For example, you could use the Apache TeeOutputStream class (javadoc) to split out the data and write a 2nd copy into a ByteArrayOutputStream. Or you could write the 2nd copy directly to the S3 output stream.
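The "tee" idea can also be sketched at the Writer level with only the JDK, along the lines of the FilterWriter suggestion in the comments. This is a hand-written stand-in, not the Apache TeeOutputStream class; the names `TeeWriter` and `TeeWriterDemo` are made up, and the two StringWriters stand in for the real file writer and whatever writer feeds the S3 upload.

```java
import java.io.BufferedWriter;
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical class: forwards every write to a primary Writer and to a second copy.
final class TeeWriter extends FilterWriter {
    private final Writer copy;

    TeeWriter(Writer primary, Writer copy) {
        super(primary);
        this.copy = copy;
    }

    @Override public void write(int c) throws IOException {
        super.write(c);
        copy.write(c);
    }

    @Override public void write(char[] cbuf, int off, int len) throws IOException {
        super.write(cbuf, off, len);
        copy.write(cbuf, off, len);
    }

    @Override public void write(String str, int off, int len) throws IOException {
        super.write(str, off, len);
        copy.write(str, off, len);
    }

    @Override public void flush() throws IOException {
        super.flush();
        copy.flush();
    }

    @Override public void close() throws IOException {
        try {
            super.close();   // closes the primary Writer
        } finally {
            copy.close();    // always close the copy as well
        }
    }
}

final class TeeWriterDemo {
    /** Writes text once and returns what each of the two sinks received. */
    static String[] writeToBoth(String text) throws IOException {
        StringWriter fileStandIn = new StringWriter();   // stands in for the file Writer
        StringWriter uploadStandIn = new StringWriter(); // stands in for the upload Writer
        try (BufferedWriter writer = new BufferedWriter(new TeeWriter(fileStandIn, uploadStandIn))) {
            writer.write(text);
        }
        return new String[] { fileStandIn.toString(), uploadStandIn.toString() };
    }

    public static void main(String[] args) throws IOException {
        String[] results = writeToBoth("line one\nline two\n");
        System.out.println(results[0].equals(results[1]));
    }
}
```

The BufferedWriter sits on the outside so that the existing code keeps its buffering and the tee only sees the larger flushed chunks.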

Another way to do it would be to "sink" the data you want to write into a ByteArrayOutputStream, extract the byte array, and write it once to the file and a second time to the stream to the S3 bucket.
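A rough sketch of that "sink once, send twice" approach follows. The class and method names are hypothetical, UTF-8 is an assumed encoding, and the destinations are plain OutputStreams; in real code one would be a FileOutputStream and the other whatever stream or client call performs the S3 upload.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
final class WriteOnceSendTwice {

    /** Renders all lines into an in-memory byte array (UTF-8 is an assumption). */
    static byte[] render(String... lines) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(sink, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                writer.write(line);
                writer.write('\n');
            }
        }
        return sink.toByteArray();
    }

    /** Writes the same bytes to every destination, e.g. the file and then the upload stream. */
    static void sendTo(byte[] content, OutputStream... destinations) throws IOException {
        for (OutputStream destination : destinations) {
            destination.write(content);
            destination.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] content = render("alpha", "beta");
        ByteArrayOutputStream fileStandIn = new ByteArrayOutputStream();
        ByteArrayOutputStream uploadStandIn = new ByteArrayOutputStream();
        sendTo(content, fileStandIn, uploadStandIn);
        System.out.println(fileStandIn.size() == uploadStandIn.size());
    }
}
```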

If the file is liable to be big, you may be better off avoiding anything that entails holding the entire file content in memory.
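One way to avoid the in-memory copy, sketched here under the assumption that the finished file is readable from the same JVM, is to close the writer first and then stream the file from disk to the upload destination. `Files.copy(Path, OutputStream)` is standard JDK; the class name `StreamFromDisk`, the method `upload`, and the destination stream are hypothetical stand-ins for however the S3 upload is actually performed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
final class StreamFromDisk {

    /** Streams a completely written file to the destination and returns the number of bytes copied. */
    static long upload(Path finishedFile, OutputStream destination) throws IOException {
        long copied = Files.copy(finishedFile, destination); // copies in chunks, not all at once
        destination.flush();
        return copied;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("report", ".txt");
        try {
            Files.write(file, "alpha\nbeta\n".getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream uploadStandIn = new ByteArrayOutputStream(); // stands in for the upload stream
            System.out.println(upload(file, uploadStandIn) + " bytes copied");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```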

Stephen C

You are using the BufferedWriter as a cache, to cache an entire stream, and then write it in two places.

What I use is a MemFile class which stores a stream in memory quite a bit more efficiently than a StringBuilder or a ByteArrayOutputStream because it does not have to allocate the memory in a single contiguous block.

This class is available open source at: https://github.com/agilepro/mendocino/blob/master/src/com/purplehillsbooks/streams/MemFile.java

These methods exist:

java.io.Reader  getReader();
java.io.Writer  getWriter();
void            outToWriter(java.io.Writer w);

Instantiate the class, get a Writer, and write to it. Once it holds all of the content, use outToWriter to stream it first to S3 and then to the file, using a different Writer for each. Or use the Reader if that is more convenient.

The question mentions Writers, which are character oriented, but there are also byte stream methods if you really mean to be working with bytes.

The documentation is at: http://purplehillsbooks.com/purpleDoc/

AgilePro