Not able to process and write pdf file from PDF response in java

Question

I am trying to convert a docx file to PDF. Microsoft Graph already provides this functionality, but the response, which has content type "application/pdf", reaches my code as a string/text. I know the response is correct because the PDF displays correctly in Postman, but when I try to write it to a file manually in Java code I am not able to do it properly.

I am able to create a PDF file with the correct number of pages, but all the pages are blank.

The response body contains text like this:

%PDF-1.7
%����
1 0 obj
<</Type/Catalog/Pages 2 0 R/Lang(en) /StructTreeRoot 466 0 R/Outlines 420 0 R/MarkInfo<</Marked true>>/Metadata 3921 0 R/ViewerPreferences 3922 0 R>>
endobj
2 0 obj
<</Type/Pages/Count 135/Kids[ 3 0 R 15 0 R 19 0 R 25 0 R 26 0 R 29 0 R 31 0 R 32 0 R 33 0 R 36 0 R 37 0 R 38 0 R 40 0 R 42 0 R 43 0 R 44 0 R 45 0 R 46 0 R 47 0 R 48 0 R 49 0 R 50 0 R 51 0 R 52 0 R 53 0 R 54 0 R 55 0 R 56 0 R 57 0 R 58 0 R 59 0 R 60 0 R 61 0 R 62 0 R 63 0 R 64 0 R 65 0 R 66 0 R 67 0 R 68 0 R 69 0 R 70 0 R 71 0 R 72 0 R 73 0 R 74 0 R 75 0 R 76 0 R 77 0 R 78 0 R 79 0 R 80 0 R 81 0 R 82 0 R 83 0 R 84 0 R 85 0 R 86 0 R 87 0 R 88 0 R 89 0 R 90 0 R 91 0 R 92 0 R 93 0 R 94 0 R 95 0 R 96 0 R 97 0 R 98 0 R 99 0 R 100 0 R 101 0 R 102 0 R 103 0 R 104 0 R 105 0 R 106 0 R 107 0 R 108 0 R 109 0 R 110 0 R 111 0 R 112 0 R 113 0 R 114 0 R 115 0 R 116 0 R 117 0 R 118 0 R 119 0 R 120 0 R 121 0 R 122 0 R 123 0 R 124 0 R 125 0 R 126 0 R 127 0 R 128 0 R 129 0 R 130 0 R 131 0 R 132 0 R 133 0 R 134 0 R 135 0 R 136 0 R 137 0 R 138 0 R 139 0 R 140 0 R 141 0 R 142 0 R 143 0 R 144 0 R 145 0 R 146 0 R 147 0 R 148 0 R 149 0 R 150 0 R 151 0 R 152 0 R 153 0 R 154 0 R 155 0 R 156 0 R 157 0 R 159 0 R 161 0 R 166 0 R 170 0 R 171 0 R 172 0 R] >>
endobj
3 0 obj
<</Type/Page/Parent 2 0 R/Resources<</Font<</F1 5 0 R/F2 9 0 R>>/ExtGState<</GS7 7 0 R/GS8 8 0 R>>/XObject<</Image11 11 0 R/Image13 13 0 R>>/ProcSet[/PDF/Text/ImageB/ImageC/ImageI] >>/MediaBox[ 0 0 612 792] /Contents 4 0 R/Group<</Type/Group/S/Transparency/CS/DeviceRGB>>/Tabs/S/StructParents 0>>
endobj
4 0 obj
<</Filter/FlateDecode/Length 702>>
stream
x���Qo�0�ߑ��h��`� U����2-R�0���Mi�D�����~g�6�   +Mh"Y`|�w�;�=YU�]<����=��x~��•�=��{/�<��"wg�7����ķ�������-�Q�%h!����༞\%�����y�0�-��c���Иk4��1����|�a�`[,�`}vn[W�k����)��|>�̣�o�u��h+��5��p"D������N[�f�ˏ�/�2��

My code:


            response = rest.get(url) {
                header "Authorization", "Bearer " + updateBulkPublishPdfDetailsData.sharepointAccessToken
            }
            RestResponse resp = sharepointUploadSummaryResponse.getResponseEntity()
            if (resp.getHeaders().getContentType().toString() == "application/pdf") {
                String pdfString = resp.body;
                FileOutputStream out = new FileOutputStream(new File("C:\\Users\\faizanahemad.shaikh\\Downloads\\abcd.pdf"));
                out.write(pdfString.getBytes());
                out.flush();
                out.close();
            }

Basically I am using Groovy/Grails, but a Java solution will work. I also tried the other solutions I could find, such as receiving the response as a blob, receiving it directly as a byte array, or converting it to an InputStream and writing that, but none of them worked. The odd thing is that I know I am receiving the correct response, because I can download the file from Postman without problems.

I also noticed that the file created from the response is slightly smaller than the one downloaded from Postman: the file I create is 1487 KB, while the file I download from Postman is 1563 KB. Both Postman and my app receive the same amount of data; the Content-Length is the same for both: 1599780.

Response:

I know I am making some mistake in processing and writing the file

`if(resp.getHeaders().getContentType().toString().equals("application/pdf")){` is what you need, although I suspect the `toString` is redundant. You don't compare java `String` with `==` — g00se, Jul 19 '23 at 14:45
No, that condition is working perfectly fine. My issue is that when I write the PDF I got in the response, all the pages are blank although the number of pages is correct. The data I am receiving is correct, as it is the same as what I receive in Postman, where the PDF displays perfectly, so I think it has something to do with processing or decoding before writing it to the file. — Faizan Shaikh Sarkar, Jul 19 '23 at 14:48
@g00se basically I am not able to download the response properly (write it to a local PDF file) in Java/Grails. — Faizan Shaikh Sarkar, Jul 19 '23 at 14:50
If it is, it's by sheer luck. You shouldn't get the body as a `String` - it *isn't* one. Which API are you using? — g00se, Jul 19 '23 at 14:51
@g00se the language I am using is Groovy/Grails/Java. The code above is in Groovy, which is why the comparison works fine. Also, the ResponseEntity body is in text format only; I will attach a screenshot so you get an idea of the response. — Faizan Shaikh Sarkar, Jul 19 '23 at 14:56
Oh right, about the Groovy thing. But how can a binary file be in text format? — g00se, Jul 19 '23 at 14:58
@g00se I have attached the screenshot of the response. It is in text and the content type is "application/pdf". I am confused myself; maybe it is in some plain-text PDF format? — Faizan Shaikh Sarkar, Jul 19 '23 at 15:01
I don't use Postman myself but maybe this is because you *asked for* text format. As you can see, all those weird characters in the response are basically telling you that the content can't be represented as a string, and shouldn't be requested as/parsed as one — g00se, Jul 19 '23 at 15:06
The ResponseEntity body holds the PDF data as a String: I checked the class of the body and it printed `class java.lang.String`. So yes, the response is a String, most likely some sort of plain PDF string data, or maybe binary PDF data as a string, or maybe encoded somehow. I personally tried everything but nothing worked. — Faizan Shaikh Sarkar, Jul 19 '23 at 15:08
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/254572/discussion-between-g00se-and-faizan-shaikh-sarkar). — g00se, Jul 19 '23 at 15:10
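
The point raised in the comments above, that binary PDF data does not survive being handled as a `String`, can be illustrated with a small sketch. The class name, the sample bytes, and the use of UTF-8 are assumptions made for illustration only; the charset the REST client in the question actually applies is not known. Bytes that are not valid in the chosen charset are replaced during decoding, so encoding the text again does not restore them, which is one plausible reason the page structure of the written file survives while the compressed page content does not.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class BinaryRoundTrip {
    // Hypothetical stand-in for a PDF header followed by a few bytes of a compressed stream
    static final byte[] SAMPLE = {'%', 'P', 'D', 'F', 0x78, (byte) 0x9C, (byte) 0xED, (byte) 0xFF};

    // Decodes the bytes as text and encodes them again, similar to
    // taking the body as a String and calling getBytes() on it
    static byte[] viaString(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8); // invalid bytes become U+FFFD
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Writes the bytes to a temporary file untouched and reads them back
    static byte[] viaFile(byte[] raw) {
        try {
            Path tmp = Files.createTempFile("sample", ".pdf");
            try {
                Files.write(tmp, raw);
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("String round trip intact: " + Arrays.equals(SAMPLE, viaString(SAMPLE)));
        System.out.println("Byte write intact: " + Arrays.equals(SAMPLE, viaFile(SAMPLE)));
    }
}
```

With these sample bytes the String round trip does not reproduce the input, while the direct byte write does. The practical consequence is to obtain the response body as a byte array or stream from the HTTP client before any text decoding takes place.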

score 0 · Accepted Answer · answered Aug 15 '23 at 09:04

I wasn't able to find an exact solution, but since I wanted to convert docx to PDF via Microsoft Graph, I resolved it by using the Microsoft Graph Java SDK. With the SDK I was able to get the converted PDF file as an InputStream and read all the data correctly.

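// targetFormat is "pdf"; driveId is not declared in this snippet and is assumed to be defined elsewhere (e.g. a field)
public static byte[] downloadConvertedFile(GraphServiceClient<Request> graphClient, String fileId, String targetFormat) throws IOException, ClientException {
        try {
            // Request the item content converted to targetFormat as a raw byte stream
            InputStream stream = graphClient
                    .customRequest("/drives/" + driveId + "/items/" + fileId + "/content", InputStream.class)
                    .buildRequest(new QueryOption("format", targetFormat))
                    .get();
            if (stream != null) {
                // Read the bytes directly; the PDF is never converted to a String
                return IOUtils.toByteArray(stream);
            }
        } catch (Exception ex) {
            // Keep the original exception as the cause
            throw new IOException("Failed to read file from response", ex);
        }
        // No stream was returned: fail explicitly instead of falling off the end of the method
        throw new IOException("No content returned for file " + fileId);
    }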

No need for 3rd party most probably: `return stream.readAllBytes()` Try-with-resources is your friend, too. Your exception handling is not correct either but I don't have time to go into that — g00se, Aug 15 '23 at 09:12
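
The suggestion in the comment above (`readAllBytes()` plus try-with-resources) could look roughly like the following sketch. The class and method names are hypothetical, and a `ByteArrayInputStream` stands in for the stream returned by the Graph SDK; `InputStream.readAllBytes()` requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class StreamToFile {
    // Drains the stream into target as raw bytes and returns how many bytes were written.
    // try-with-resources closes the stream even if reading or writing fails.
    static int save(InputStream source, Path target) {
        try (InputStream in = source) {
            byte[] data = in.readAllBytes(); // no charset decoding involved
            Files.write(target, data);
            return data.length;
        } catch (IOException e) {
            throw new UncheckedIOException("Could not save stream to " + target, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the converted PDF stream
        byte[] fake = {'%', 'P', 'D', 'F', (byte) 0x9C, (byte) 0xFF};
        Path out = Files.createTempFile("converted", ".pdf");
        int written = save(new ByteArrayInputStream(fake), out);
        System.out.println(written + " bytes written, file size " + Files.size(out));
        Files.deleteIfExists(out);
    }
}
```

Because no text decoding happens at any point, the number of bytes on disk matches the number of bytes read from the stream, which is a quick check to compare against the Content-Length of the response.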
