
I am running into a situation where files with non-printing UTF-8 characters in their filenames are being uploaded to a server. I know how to fix the names, but I'd like to be able to create such files for testing, and I'd also like to understand how people might be accidentally (or intentionally) doing this in the first place.

So, with that in mind, what are the possible ways that a person could create filenames that have non-printing characters in them? In this case, it's DELETE (U+007F), but I'm interested in any non-printing character.

I am looking for methods that are easy to accomplish intentionally on the command line (Linux, Unix, and DOS), but also for ways that a person might accidentally do this via the command line or a GUI (Windows, OS X, Linux).

thomij
  • Can you expand on how they're being uploaded? The reason I ask is that, if you take HTTP for example, the filename is typically provided as part of the request using the [multipart/form-data](https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2) content type. A browser will use the filename of the selected file. But if you're interested in abuse vectors, note that the value provided can be anything (even a name that cannot possibly exist on a given filesystem) if the request is constructed outside of a browser. – nj_ Mar 11 '16 at 19:13
  • @nj_ that is a good point. In this case, the files are uploaded over HTTP using multipart/form-data, presumably via the client's browser. I'm more interested in accidental vectors than abuse vectors, but I would be happy to learn about any possible scenarios. – thomij Mar 11 '16 at 19:26

1 Answer


On Un*x with Bash, you can create a file whose name contains an arbitrary byte by writing that byte as an escape sequence:

$ touch "$(echo -e "\x7f.txt")"

You can then verify it with:

$ ls -b
\177.txt
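
To build a small set of test files in one go, something like the following sketch may work. It assumes a POSIX shell with `mktemp` and `find` available; the `test_` prefix, the temporary directory, and the particular control characters are illustrative choices, not anything taken from the question.

```shell
#!/bin/sh
# Sketch: create test files whose names contain non-printing characters.
# The directory and the "test_" prefix are hypothetical, for illustration only.
dir=$(mktemp -d)

# Octal codes: 001 (SOH), 007 (BEL), 033 (ESC), 177 (DEL, U+007F)
for code in 001 007 033 177; do
    # printf interprets \NNN octal escapes in its format string
    name=$(printf "test_\\${code}.txt")
    : > "$dir/$name"    # create an empty file with that name
done

# Count the files whose names contain at least one control character
count=$(find "$dir" -type f -name '*[[:cntrl:]]*' | wc -l)
echo "files with control characters: $count"

# Remove the directory when finished:
# rm -rf "$dir"
```

Running `ls -b "$dir"` afterwards should show the escaped names, and the same `find` pattern can be pointed at an upload directory to spot affected files.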
Alastair McCormack