
I'm attempting to upload a (non-empty) PDF to Confluence using the Atlassian REST API (in Java). The PDF that ends up being uploaded has no content, yet it is exactly the same file size as the original PDF document!

Here is the code that I am using to write the contents of the request:

private String generateUploadContent( final ByteArrayInputStream is,
                                      final String fileName,
                                      final String boundary )
                                          throws Exception
{
    final String newLine = "\r\n";
    final StringWriter writer = new StringWriter();

    writer.write( "--" + boundary + newLine );
    writer.write(
        "Content-Disposition: form-data; name=\"file\"; filename=\"" +
            fileName + "\"" + newLine );
    if( fileName.endsWith( ".pdf" ) )
    {
        writer.write( "Content-Type: application/pdf" + newLine );
    }
    else
    {
        writer.write( "Content-Type: text/plain" + newLine );
    }
    writer.write( "Content-Transfer-Encoding: binary" + newLine + newLine );

    IOUtils.copy( is, writer );

    writer.write( newLine );
    writer.write( "--" + boundary + "--" + newLine );

    return writer.toString();
}

The ByteArrayInputStream 'is' is the input stream of the file that I'm attempting to upload. This code works perfectly for plain text files, just not PDFs.

If I do the following, the resulting file on my system is a PDF with all of the expected content, so I know the input stream is fine:

final File file = new File( "C:/file.pdf" );
final FileOutputStream fos = new FileOutputStream( file );

IOUtils.copy( is, fos );

The following is a complete list of headers that I am also sending with the content generated above as part of the request:

Authorization: Basic <!AUTH STUFF!>
X-Atlassian-Token: no-check
Content-Type: multipart/form-data; boundary=<!BOUNDARY!>

Based on my research so far (which has been quite extensive!), the common suggestions seem to be related to:

  • Using the "Cache-Control: no-cache" header
  • Providing/not providing the "Content-Length" header
  • Something to do with the encoding of the PDF contents when it is converted to a String (apparently the equal file size points to this being a problem)

I have only really had a good crack at the last solution above, but with no success.

IOUtils.copy( is, writer, StandardCharsets.UTF_8 );

The other two solutions didn't have much success in any of the forums that I read them in, so I haven't attempted them.
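To illustrate the third suggestion in isolation, here is a minimal sketch of what happens to binary data when it is decoded into a String and encoded back. The class name `CharsetRoundTrip` and the sample bytes are made up for the demonstration, and it assumes the String is later turned back into bytes with the same charset, which may not match what the real request code does:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the upload code.
class CharsetRoundTrip {

    // Decode bytes into a String and encode them back again,
    // which is effectively what passing binary data through a Writer does.
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // "%PDF" followed by four high bytes of the kind found in binary PDF data.
        byte[] data = {
            0x25, 0x50, 0x44, 0x46,
            (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3
        };

        // The high bytes are not valid UTF-8, so they are replaced on decoding.
        System.out.println("UTF-8 intact:      "
            + Arrays.equals(data, roundTrip(data, StandardCharsets.UTF_8)));

        // ISO-8859-1 maps every byte value to exactly one char and back.
        System.out.println("ISO-8859-1 intact: "
            + Arrays.equals(data, roundTrip(data, StandardCharsets.ISO_8859_1)));
    }
}
```

The UTF-8 round trip does not return the original bytes, while the ISO-8859-1 one does. That would be consistent with plain text files surviving and PDFs not, although it does not by itself prove this is what happens in the request.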

If anyone has an alternative solution, or can provide greater insight into any of the provided solutions, I would be very appreciative. I have spent too long trying to figure this one out on my own now.

If it helps, here is the Atlassian REST API reference that I have based my code on: https://docs.atlassian.com/atlassian-confluence/REST/latest-server/#content/{id}/child/attachment-createAttachments

I have tried using the suggested cURL command to upload the same file to the same page, and the file is uploaded fine and contains the expected contents. It's just my Java code that seems to be going wrong somewhere.

Thanks in advance,
Adam

EDIT: My unsuccessful attempt at base64 encoding:

if( fileName.endsWith( ".pdf" ) )
{
    writer.write( "Content-Type: application/pdf" + newLine );
    writer.write( "Content-Transfer-Encoding: base64" + newLine + newLine );

    final byte[] base64Bytes = Base64.encodeBase64( IOUtils.toByteArray( is ) );
    final String base64String = new String( base64Bytes );

    writer.write( base64String );
}
else
{
    writer.write( "Content-Type: text/plain" + newLine );
    writer.write( "Content-Transfer-Encoding: binary" + newLine + newLine );

    IOUtils.copy( is, writer );
}

writer.write( newLine );
writer.write( "--" + boundary + "--" + newLine );
  • I have also just asked the Atlassian forums: https://answers.atlassian.com/questions/53754520/empty-pdf-on-rest-api-post – Adam Feb 15 '17 at 06:52
  • The main problem is that you are generating the upload content as a `String` in combination with a "Content-Transfer-Encoding: binary". For non-text files (like zips, docxs, pdfs,...) preparing the upload content as text (everything you handle in a `String` is text) usually damages them unless you base64-encode them (and use a Content-Transfer-Encoding: base64). – mkl Feb 16 '17 at 05:26
  • Thanks for replying. I also tried removing the "Content-Transfer-Encoding" header altogether for PDFs, and no luck. So are you saying if I use "base64" as the encoding in the header for PDFs I might have more success? Or would I need to make further changes to accompany that one to make it work? – Adam Feb 16 '17 at 08:45
  • P.S. I'll give this suggestion a crack when I get on the computer later, and report back with an outcome – Adam Feb 16 '17 at 08:47
  • *"So are you saying if I use "base64" as the encoding in the header for PDFs I might have more success? Or would I need to make further changes to accompany that one to make it work?"* - You have to set "base64" as content transfer encoding *and* (of course) you have to base64 encode. Or you change your code altogether and don't construct your upload content as a String but instead as a byte buffer or array. – mkl Feb 16 '17 at 12:15
  • I tried base64 encoding with no luck. See my edit above for the code that didn't work – Adam Feb 21 '17 at 23:37
  • Hhmmm, it looks as if it should. Probably you should use wireshark or some http proxy to sniff the requests as they actually go over the line for your code and for the suggested cURL command and compare them. – mkl Feb 22 '17 at 08:54
  • I didn't keep the files, but I have done this already using Fiddler, and the differences all appeared to be certain characters in the cURL request appearing as question marks '?' in the Java request. This, as well as the identical file sizes, is what points me to some kind of character encoding issue – Adam Feb 28 '17 at 02:55
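
For reference, mkl's other suggestion (building the upload content as a byte array instead of a String) could be sketched roughly as follows. This is an untested-against-Confluence sketch: the class name `MultipartBody` and method name `build` are made up, the part headers are copied from the question's code, and it assumes the resulting bytes are then written directly to the request's output stream rather than through a Writer:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; only the part headers are taken from the question.
class MultipartBody {

    static byte[] build(ByteArrayInputStream is, String fileName, String boundary) {
        final String newLine = "\r\n";
        final String contentType =
            fileName.endsWith(".pdf") ? "application/pdf" : "text/plain";

        // The textual parts are safe to build as Strings.
        final String head = "--" + boundary + newLine
            + "Content-Disposition: form-data; name=\"file\"; filename=\""
            + fileName + "\"" + newLine
            + "Content-Type: " + contentType + newLine
            + "Content-Transfer-Encoding: binary" + newLine + newLine;
        final String tail = newLine + "--" + boundary + "--" + newLine;

        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        final byte[] headBytes = head.getBytes(StandardCharsets.UTF_8);
        out.write(headBytes, 0, headBytes.length);

        // Copy the file content byte for byte; it never passes through a String.
        final byte[] buffer = new byte[8192];
        int n;
        while ((n = is.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, n);
        }

        final byte[] tailBytes = tail.getBytes(StandardCharsets.UTF_8);
        out.write(tailBytes, 0, tailBytes.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        final byte[] payload = { 0x25, 0x50, 0x44, 0x46, (byte) 0x81, (byte) 0xFF };
        final byte[] body = build(new ByteArrayInputStream(payload), "file.pdf", "BOUNDARY");
        System.out.println("Body length in bytes: " + body.length);
    }
}
```

The copy loop could equally be `IOUtils.copy( is, out )` with an OutputStream target, as in the file-writing test from the question. If a Content-Length header is sent, it would be the length of the returned array.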
