According to the RFC, the filename parameter of the Content-Disposition header in multipart/form-data takes an HTTP quoted-string: a string between quotes in which the character '\' can escape any other ASCII character.

The problem is that web browsers don't do this.

IE6 sends:

Content-Disposition: form-data; name="file"; filename="z:\tmp\test.txt"

instead of the expected:

Content-Disposition: form-data; name="file"; filename="z:\\tmp\\test.txt"

According to the rules, what IE6 actually sends should be parsed as z:tmptest.txt instead of z:\tmp\test.txt.
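
To make the mismatch concrete, here is a minimal Java sketch of strict quoted-string unescaping, where each backslash is dropped and the character after it is kept literally. The class and method names (`StrictQuotedString`, `unescape`) are hypothetical and chosen only for illustration, and the input is assumed to be the text between the surrounding quotes.

```java
final class StrictQuotedString {
    // Strict quoted-string decoding: a backslash is removed and the
    // character that follows it is copied to the output unchanged.
    static String unescape(String quoted) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < quoted.length(); i++) {
            char c = quoted.charAt(i);
            if (c == '\\' && i + 1 < quoted.length()) {
                c = quoted.charAt(++i); // keep the escaped character, drop the backslash
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // What IE6 sends: single backslashes are consumed as escapes.
        System.out.println(unescape("z:\\tmp\\test.txt"));
        // What a conforming client would send: doubled backslashes survive as one each.
        System.out.println(unescape("z:\\\\tmp\\\\test.txt"));
    }
}
```

With the IE6 value the path separators disappear, which is exactly the z:tmptest.txt result described above; only the doubled-backslash form decodes back to z:\tmp\test.txt.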

Firefox, Konqueror and Chrome don't escape " characters. For example:

Content-Disposition: form-data; name="file"; filename=""test".txt"

instead of the expected:

Content-Disposition: form-data; name="file"; filename="\"test\".txt"

So, how would you suggest dealing with this issue? Does anybody have an idea?

Stephen Kennedy
Artyom

2 Answers

Though this is an old thread, I'm adding the Java solution below for anyone who might be interested.

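    // ContentDisposition and ParseException come from a JDK-internal package (not a public API):
    // import com.sun.xml.internal.messaging.saaj.packaging.mime.internet.*;

    try {
        // Parse the header value, then read back the plain "filename" parameter.
        ContentDisposition contentDisposition = new ContentDisposition("attachment; filename=\"myfile.log\"; filename*=UTF-8''myfile.log");
        System.out.println(contentDisposition.getParameter("filename"));
    } catch (ParseException e) {
        e.printStackTrace();
    }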
Pavan Kumar
  • Since the question is not particular to Java, an explanation of how this solves the problem would be useful. – Nisse Engström May 18 '16 at 18:25
  • Agreed. While looking into the same problem, I even found a thread discussing the regex pattern (http://stackoverflow.com/a/27226712/3940047). I added this solution as it might help someone in the same context. People just google with appropriate keywords and can land here, and if they happen to be Java guys, they might find it useful. – Pavan Kumar May 19 '16 at 06:51
  • @PavanKumar totally agree, this should be a language-agnostic solution considering the question didn't mention Java. But as I always say, if you have the option, **always use a well-defined library for parsing**. – Krusty the Clown Aug 22 '23 at 16:37

Is there a reason that you need to parse this filename at all?

At least the one thing that's consistent is that the filename portion of the header ends with a double quote, so you just need to read in everything between filename=" and the final ".

Then you can probably treat any backslash other than \\ or \" as a literal backslash, unless you think it's particularly likely that users will be uploading filenames with tabs in them. :)
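
The two steps above could be sketched in Java roughly as follows. This is only an illustration of the heuristic, not a spec-compliant parser: the names `LenientFilename`, `extractRaw` and `unescapeLeniently` are hypothetical, and the sketch assumes filename is the last parameter of the header.

```java
final class LenientFilename {
    // Returns the text between filename=" and the final double quote
    // of the header, or null if either delimiter is missing.
    static String extractRaw(String header) {
        String marker = "filename=\"";
        int start = header.indexOf(marker);
        if (start < 0) {
            return null;
        }
        start += marker.length();
        int end = header.lastIndexOf('"');
        return end < start ? null : header.substring(start, end);
    }

    // Treats only \\ and \" as escapes; any other backslash is kept literally.
    static String unescapeLeniently(String raw) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                char next = raw.charAt(i + 1);
                if (next == '\\' || next == '"') {
                    out.append(next); // recognised escape: emit the escaped character
                    i++;
                    continue;
                }
            }
            out.append(c); // ordinary character or literal backslash
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String ie6 = "Content-Disposition: form-data; name=\"file\"; filename=\"z:\\tmp\\test.txt\"";
        System.out.println(unescapeLeniently(extractRaw(ie6)));
    }
}
```

One concession of this approach: an unescaped name that really contains a doubled backslash (a UNC path such as \\server\share, for instance) would lose one of them.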

Christopher Orr
  • "Is there a reason that you need to parse this filename at all?" -- yes, I want to know the file name ;). "At least the one thing that's consistent is that the filename portion of the header ends with a double quote," -- the filename and name fields need not come in this specific order, so it is a bad idea to assume that the file name ends with the last quotation mark. – Artyom May 30 '10 at 12:16
  • Want != need. ;) Ok, so you're at least guaranteed that it'll end with `"` or with `"; ` -- with this lack of consistency you have to make some concessions, like relying on the fact that users won't put `"; ` in the middle of their file names :) Alternatively, are you using a web framework that supports a best-effort parsing of this attribute for you? – Christopher Orr May 30 '10 at 14:05
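
A rough Java sketch of that concession, for the case where filename is not the last parameter: the value is taken to end at the first `"; ` after `filename="`, or otherwise at the final `"` of the header. The names `FilenameAnyPosition` and `extractRaw` are hypothetical, and the sketch deliberately gives the wrong answer for a file name that itself contains `"; `.

```java
final class FilenameAnyPosition {
    // Returns the raw (still escaped) filename value, or null if it cannot be located.
    static String extractRaw(String header) {
        String marker = "filename=\"";
        int start = header.indexOf(marker);
        if (start < 0) {
            return null;
        }
        start += marker.length();
        // Case 1: another parameter follows, so the value ends at the first `"; `.
        int end = header.indexOf("\"; ", start);
        if (end < 0) {
            // Case 2: filename is the last parameter, so the value ends at the final quote.
            end = header.lastIndexOf('"');
        }
        return end < start ? null : header.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(extractRaw("form-data; filename=\"\"test\".txt\"; name=\"file\""));
        System.out.println(extractRaw("form-data; name=\"file\"; filename=\"z:\\tmp\\test.txt\""));
    }
}
```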