I need to display the PDF files from a third party on a webpage. I have links to the documents as they appear on the source pages. Unfortunately, none of the links are actual links to the documents, but rather GET requests with certain parameters, or other indirect references like so:
http://cdm.unfccc.int/UserManagement/FileStorage/SNM7EQ2RUD4IA0JLO3HCZ8BTK1VX5P
If the website does not force the download with a Content-Disposition: attachment; header in the response, as is the case for the link above, then I can easily achieve the necessary display with:
<object width="90%" height="600" type="application/pdf"
data="http://cdm.unfccc.int/UserManagement/FileStorage/SNM7EQ2RUD4IA0JLO3HCZ8BTK1VX5P"
id="pdf_content">
<p>Can't seem to display the document. Try <a href="http://cdm.unfccc.int/UserManagement/FileStorage/SNM7EQ2RUD4IA0JLO3HCZ8BTK1VX5P">
downloading</a> it.</p>
<embed type="application/pdf" src="http://cdm.unfccc.int/UserManagement/FileStorage/SNM7EQ2RUD4IA0JLO3HCZ8BTK1VX5P"
width="90%" height="600" />
</object>
This "stands" and "falls" very gracefully in the majority of browsers. Using <object> and <embed> at the same time works for me and, as far as I've tested, does not affect the problem that I describe below (tell me if I'm wrong).
The problem begins when the website does force the download with the above-mentioned header in the HTTP response. For instance, the document at the following link:
http://mer.markit.com/br-reg/PublicReport.action?getDocumentById=true&document_id=103000000000681
is not displayed by the HTML structure I showed above. It falls back gracefully and the download link works just fine, but I need to view it!
I've been banging my head on the wall for 3 days now, can't figure it out.
Maybe there is a way to catch the headers of the request somehow and ignore them, or maybe force the "viewability" into the GET request.
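To make the first idea concrete: one way to picture "ignoring" the header is to have the application fetch the document itself and re-serve it with the disposition changed from attachment to inline. The following is only a minimal sketch in plain Ruby under that assumption. The name inline_disposition and the plain Hash of headers are made up for illustration, and nothing here has been tried against the sites above.

```ruby
# Hypothetical helper (the name is an assumption, not from any library).
# Takes the upstream response headers as a Hash and returns a copy in which
# a Content-Disposition of "attachment" is rewritten to "inline".
# All other headers are copied through unchanged.
def inline_disposition(headers)
  headers.each_with_object({}) do |(name, value), result|
    if name.to_s.casecmp?("Content-Disposition")
      # Only the leading disposition type is replaced; the filename
      # parameter, if any, is kept as it was.
      result[name] = value.to_s.sub(/\A\s*attachment/i, "inline")
    else
      result[name] = value
    end
  end
end

upstream = {
  "Content-Type"        => "application/pdf",
  "Content-Disposition" => 'attachment; filename="report.pdf"'
}
puts inline_disposition(upstream)["Content-Disposition"]
```

In a Rails controller the same effect would presumably come from passing disposition: "inline" to send_data when re-serving the downloaded bytes, at the cost of routing every document through the application server.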
For general information, this is part of a Ruby on Rails application, so the solution should fit along those lines. I'm not including any of the application's Rails code here, because it doesn't seem to be the source of the problem.
Any straightforward solution would be prayed upon, and any others heavily appreciated.
The alternative solutions I thought of, and why I discarded them:
1. Download all those files to local storage in advance and just serve them from there.
   The necessary storage capacity would be around 1TB and growing, so storing it on the server would be expensive for the small commercial SaaS that this is.
2. Cache those documents around the time when they might be needed. For instance, when someone opens the page of a project, a background process downloads the related PDFs, so if the user clicks a document link he is served the document that was just downloaded to local storage. The cache could be kept for a few hours or days in case the user returns.
   This might be viable, but if the user base became significant, this solution would have the same problem as the one above. Also, at this moment I would not know how to go about implementing this kind of algorithm (very much a beginner, you see).
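To make the caching alternative more concrete, here is a rough sketch in plain Ruby of what such a time-limited cache might look like. It is an illustration under assumptions, not a tested design: the class name PdfCache, the six-hour default, the directory layout, and the injectable fetcher are all hypothetical, and the background job that would call it when a project page is opened is left out.

```ruby
require "digest"
require "fileutils"

# Hypothetical time-limited file cache for downloaded PDFs.
# Every name and default here is illustrative only.
class PdfCache
  # dir:     directory where cached files are stored
  # ttl:     how long a cached file counts as fresh, in seconds
  # fetcher: any callable that takes a URL and returns the file bytes
  def initialize(dir:, ttl: 6 * 60 * 60, fetcher:)
    @dir = dir
    @ttl = ttl
    @fetcher = fetcher
    FileUtils.mkdir_p(@dir)
  end

  # Returns the local path for the URL, downloading the document first
  # if there is no cached copy or the cached copy is older than the TTL.
  def fetch(url)
    path = path_for(url)
    File.binwrite(path, @fetcher.call(url)) unless fresh?(path)
    path
  end

  # Deletes expired files and returns how many were removed.
  # Meant to be run periodically so the cache does not grow without bound.
  def sweep
    stale = Dir.glob(File.join(@dir, "*.pdf")).reject { |path| fresh?(path) }
    stale.each { |path| File.delete(path) }
    stale.size
  end

  private

  # The file name is a hash of the URL, so query strings are safe to use.
  def path_for(url)
    File.join(@dir, "#{Digest::SHA256.hexdigest(url)}.pdf")
  end

  def fresh?(path)
    File.exist?(path) && (Time.now - File.mtime(path)) < @ttl
  end
end
```

A fetcher could be as small as ->(url) { Net::HTTP.get(URI(url)) } after require "net/http", although that call does not follow redirects. Because only recently requested documents stay on disk and sweep removes the rest, the storage needed would depend on recent traffic rather than on the full 1TB archive.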