I have a generic handler that serves files for download:

    Dim request As HttpRequest = context.Request
    Dim response As HttpResponse = context.Response
    response.ContentType = "application/octet-stream"
    response.AddHeader("content-disposition", "inline; filename=" & filename)
    response.Buffer = True
    response.OutputStream.Write(fileBytes, 0, fileBytes.Length)
    response.Flush()
    response.Close()

('fileBytes' is my byte array, 'filename' is my file name.)

When fileBytes is, say, a .txt file, the download is triggered and the file is read perfectly.

I discovered, however, that .pdf and .docx files were being corrupted. In the case of .docx, Word said that the file needed to be recovered and asked me for permission to do so. When I granted permission, it repaired the file immediately and displayed it perfectly.

Obviously I didn't want users to see this corruption dialogue, and after researching for a while I discovered this: http://forums.asp.net/t/1301978.aspx/1/10, which suggested that the corruption was caused by one extra empty byte being written at the end of the byte array. I checked by dropping the length by one byte:

    response.OutputStream.Write(fileBytes, 0, fileBytes.Length - 1)

and like magic, .docx downloads now work! (This is not my current problem; I include it for context and in case anybody else has the same issue.)
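The question does not show how fileBytes is filled, so the following is only an assumption about where a single trailing byte could come from. In VB.NET the number in an array declaration is the upper bound, not the element count, so sizing a buffer with the stream length allocates one element too many. A minimal sketch (the module and variable names are hypothetical):

```vb
Imports System
Imports System.IO

Module TrailingByteDemo
    Sub Main()
        Dim source As Byte() = {1, 2, 3, 4, 5}

        Using stream As New MemoryStream(source)
            ' Dim x(n) declares an UPPER BOUND of n, so this array has n + 1 elements.
            Dim tooLong(CInt(stream.Length)) As Byte
            stream.Read(tooLong, 0, CInt(stream.Length))
            Console.WriteLine(tooLong.Length)   ' prints 6: five data bytes plus one trailing zero

            ' Subtracting 1 from the upper bound gives exactly stream.Length elements.
            stream.Position = 0
            Dim exact(CInt(stream.Length) - 1) As Byte
            stream.Read(exact, 0, exact.Length)
            Console.WriteLine(exact.Length)     ' prints 5
        End Using
    End Sub
End Module
```

If the buffer is sized this way, fixing the declaration removes the extra byte at the source, and the Length - 1 adjustment when writing is no longer needed.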

My current problem is that although .docx files are now streaming correctly, .pdf files are not. They seem to transfer in one piece (at the correct KB size), but when I try to open the downloaded file, Adobe Reader X tells me:

Adobe Reader could not open xxxx because it is either not a supported file type 
or because the file has been damaged (for example, it was sent as an email 
attachment and wasn't correctly decoded).

There was a fairly long unresolved discussion on the Adobe forums dated 2008 (http://forums.adobe.com/thread/391712) that addresses this exact issue, but it is now dead. I have tried all of the workarounds that users posted there (content type: application/pdf instead of application/octet-stream, disposition: attachment instead of inline, different content-encodings and charsets, etc.), but all to no avail.

I wonder if anybody has encountered this problem before that could point me somewhere vaguely approximating something that even remotely resembles the right direction!
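For anyone diagnosing a similar failure, one way to narrow it down is to check whether the byte array still looks like a PDF before it is written to the response. The sketch below is not from the original post; LooksLikePdf is a hypothetical helper, and it only tests for the usual "%PDF-" header at the start and an "%%EOF" marker near the end, which is a rough heuristic rather than real validation.

```vb
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module PdfSanityCheck
    ' Hypothetical helper: a rough check, not a full PDF validator.
    Function LooksLikePdf(data As Byte()) As Boolean
        If data Is Nothing OrElse data.Length < 8 Then Return False

        ' A PDF normally starts with the marker "%PDF-".
        Dim head As String = Encoding.ASCII.GetString(data, 0, 5)

        ' The "%%EOF" marker normally sits within the last few bytes;
        ' inspect up to the final 1024 bytes.
        Dim tailLength As Integer = Math.Min(1024, data.Length)
        Dim tail As String = Encoding.ASCII.GetString(data, data.Length - tailLength, tailLength)

        Return head = "%PDF-" AndAlso tail.Contains("%%EOF")
    End Function

    Sub Main()
        Dim sample As Byte() = Encoding.ASCII.GetBytes("%PDF-1.4" & vbLf & "..." & vbLf & "%%EOF" & vbLf)
        Console.WriteLine(LooksLikePdf(sample))   ' prints True
    End Sub
End Module
```

Running the same check on the stored bytes and on the downloaded file would show whether the damage happens before or after the handler writes the response.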

Brian Tompsett - 汤莱恩
Dom Vinyard
  • Do you have a sample damaged PDF file that I can look at? – iPDFdev May 11 '12 at 12:57
  • "Obviously Adobe Reader embedded the name of the file into the binary object when it was saved": this is so not true... For a start, Adobe Reader in general does not generate PDF files. I do not know what solved your issue, but I am 99.9999% sure this was not the cause. – yms May 11 '12 at 13:57
  • Hmm, it will only open if it has the original filename sent in the content-disposition? It must be validating against something. I would have said that it was embedded by the application that created the PDF, but then realised that this would need to be maintained if the file was renamed within the file system, and I can't imagine the file system would re-embed its own path. But from a deconstructive point of view, if it only works when I send the exact filename, then this filename must be embedded somewhere in the binary object for it to recognise a match? – Dom Vinyard May 11 '12 at 14:29
  • @AtheistforPaytheist Maybe the filename you were setting before contained invalid characters? Or maybe you were sending something that did not have ".pdf" extension? – yms May 11 '12 at 14:39
  • Definitely not the case. response.AddHeader("content-disposition", "inline;filename=test.pdf") does not work. – Dom Vinyard May 11 '12 at 14:53

1 Answer

(Answer in the comments and edits. See "Question with no answers, but issue solved in the comments (or extended in chat)".)

The OP wrote:

Having stared at it for hours, the answer appeared shortly after posting this question - which is so often the way! At any rate, here is the resolution for anybody else stuck with my specific issue:

When I added a file to the database, I also allowed it to be renamed. I would select a file, give it a name and store it in the DB as [Fileblob],[Filename]. I assumed I could pick any arbitrary name, because the file was no longer tied to a specific location in the file system. Wrong! With .txt and .docx files this was fine; the original name was never invoked.

Obviously something has embedded the name of the file into the binary object when it was saved and checks the name provided in the content-disposition against the name embedded into the document when it is opened. It then throws a corruption error if they do not match.

Now, I am storing the file in the database as [Fileblob],[Filename],[originalFilename] and when opening it I am using:

    response.AddHeader("content-disposition", "inline;filename=" & originalFilename)

...to give it a name it understands. I suppose the more elegant way would be to strip the original name from the PDF when storing it in the database, as it is no longer needed, but as a workaround this works just fine.
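A minimal sketch of building that header value from the stored original name follows. BuildContentDisposition is a hypothetical helper, not part of the OP's code, and wrapping the name in quotes is an addition here so that names containing spaces stay in one piece; the OP's version sends the name unquoted.

```vb
Imports System
Imports Microsoft.VisualBasic

Module ContentDispositionSketch
    ' Hypothetical helper: builds the header value from the stored original name.
    Function BuildContentDisposition(originalFilename As String) As String
        ' Remove characters that would break the quoted header value.
        Dim safeName As String = originalFilename.Replace("""", "").Replace(vbCr, "").Replace(vbLf, "")
        Return "inline; filename=""" & safeName & """"
    End Function

    Sub Main()
        Console.WriteLine(BuildContentDisposition("annual report.pdf"))
        ' prints: inline; filename="annual report.pdf"
    End Sub
End Module
```

In the handler, the result would be passed as the second argument to response.AddHeader("content-disposition", ...).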

Community
Brian Tompsett - 汤莱恩