While trying to upload a binary file to a web server, I noticed that the uploaded data is corrupted when the Content-Type header of the POST request includes a charset="utf-8" parameter.
Chrome appears to omit charset parameters altogether, in both the request header and the part headers, when it performs a file upload POST request.
However, some web servers handle the request correctly even when the charset="utf-8" parameter is included.
Working Example:
POST /upload.php HTTP/1.1
Content-Type: multipart/form-data; boundary=----FormBoundary
------FormBoundary
Content-Disposition: form-data; name="fileupload"; filename="data.bin"
Content-Type: application/octet-stream
------FormBoundary--
HTTP/1.1 200 OK
The file was received successfully.
Failing Example:
POST /upload.php HTTP/1.1
Content-type: multipart/form-data; charset="utf-8"; boundary=----FormBoundary
------FormBoundary
Content-Disposition: form-data; name="fileupload"; filename="data.bin"
Content-Type: application/octet-stream
------FormBoundary--
HTTP/1.1 200 OK
The uploaded file seems to be broken.
Adding charset="utf-8" seems to make some web servers fail to decode the uploaded data correctly, while other web servers appear to be unaffected.
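One conceivable mechanism (an assumption only, not confirmed for any particular server) is that the server applies the declared charset to the part body, decoding the raw bytes as UTF-8 text and re-encoding them. The minimal Python sketch below shows why such a round trip would damage binary data; the helper name `utf8_round_trip` and the sample payload are hypothetical.

```python
def utf8_round_trip(data: bytes) -> bytes:
    """Decode bytes as UTF-8 (replacing invalid sequences) and encode them again.

    This imitates a hypothetical server that treats an uploaded body as text.
    """
    return data.decode("utf-8", errors="replace").encode("utf-8")


# Hypothetical binary payload: 0x80 and 0xFF are not valid UTF-8 on their own.
binary_payload = bytes([0x00, 0x80, 0xFF, 0x41])
# Pure ASCII is valid UTF-8, so it survives the round trip unchanged.
ascii_payload = b"plain text"

damaged = utf8_round_trip(binary_payload)
intact = utf8_round_trip(ascii_payload)

print(damaged == binary_payload)  # False: invalid bytes became U+FFFD
print(intact == ascii_payload)    # True
```

If something like this happens, it would explain why only binary uploads break while text uploads look fine.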
Please note that in both cases, adding a charset after the application/octet-stream type has no effect; my question is only about the charset added after the multipart/form-data type.
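To make the difference between the two requests concrete, here is a minimal Python sketch that parses both header values. It uses the standard library's `email.message.Message` purely as a generic MIME header parser; real web servers use their own parsers, and the helper name `parse_content_type` is hypothetical.

```python
from email.message import Message


def parse_content_type(header_value: str) -> dict:
    """Split a Content-Type header value into its media type and parameters."""
    msg = Message()
    msg["Content-Type"] = header_value
    return {
        "type": msg.get_content_type(),
        "boundary": msg.get_param("boundary"),
        "charset": msg.get_param("charset"),  # None when the parameter is absent
    }


working = parse_content_type("multipart/form-data; boundary=----FormBoundary")
failing = parse_content_type(
    'multipart/form-data; charset="utf-8"; boundary=----FormBoundary'
)

print(working)
print(failing)
```

Both values yield the same media type and boundary; the only difference a parser sees is the extra charset parameter, so any change in behaviour would come from how the server chooses to act on it.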
What happens on the server when a charset="utf-8" parameter is present after the multipart/form-data type?