1

Without considering any implementation behaviour or host OS, are there, by design, characters that aren't allowed in file or directory names?

I’m especially interested (considering Git is sometimes used as a front-end) in whether the ASCII NUL character is allowed.
If it isn’t allowed, can an attempt to create such a file with rugged lead to remote code execution?

user2284570
  • 2,891
  • 3
  • 26
  • 74
  • **No! This isn’t too broad!** If there is a disallowed character, there is a **definite answer!** *(yes it is)*. **Otherwise, please explain how to insert a filename containing a NUL byte in a Git repository!** – user2284570 Oct 02 '15 at 20:22
  • Your title asked whether NUL is allowed; the body of the question asks whether some characters are *not* allowed. Since an existing answer says "Yes", meaning that some characters are forbidden, I've rephrased the title accordingly. (I also replaced the small capitals with ordinary uppercase letters; non-ASCII characters might be difficult to read on some systems.) – Keith Thompson Oct 02 '15 at 21:50

1 Answer

4

Yes. The index and tree objects both impose limitations due to their design:

  1. A NUL cannot be in a path name. Git uses NUL-terminated strings to store filenames internally, both in the index and in tree objects.

  2. A / cannot be in a filename, as it is the path separator in the index.
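To make the two design limits concrete, here is a minimal sketch of how a tree entry is laid out on disk: an ASCII octal mode, a space, the name, a NUL byte, then the raw object hash. The function names are hypothetical helpers for illustration, and the sketch assumes a SHA-1 repository (20-byte hashes). It is not taken from any Git implementation.

```python
def encode_tree_entry(mode: str, name: bytes, sha1: bytes) -> bytes:
    """Serialize one tree entry: mode, space, name, NUL, 20 raw hash bytes."""
    if len(sha1) != 20:
        raise ValueError("this sketch assumes 20-byte SHA-1 hashes")
    return mode.encode("ascii") + b" " + name + b"\0" + sha1


def parse_tree_entry(body: bytes, pos: int = 0):
    """Read one entry starting at pos; the name ends at the first NUL byte."""
    space = body.index(b" ", pos)
    nul = body.index(b"\0", space + 1)
    mode = body[pos:space].decode("ascii")
    name = body[space + 1:nul]
    sha1 = body[nul + 1:nul + 21]
    return mode, name, sha1, nul + 21


def split_index_path(path: bytes) -> list:
    """The index stores whole paths, so '/' always separates components."""
    return path.split(b"/")


if __name__ == "__main__":
    sha = bytes(range(20))
    good = encode_tree_entry("100644", b"README", sha)
    print(parse_tree_entry(good)[:2])

    # A name with an embedded NUL cannot survive a round trip: the parser
    # stops at the first NUL and reads the rest of the name as hash bytes.
    bad = encode_tree_entry("100644", b"a\0b", sha)
    print(parse_tree_entry(bad)[:2])

    print(split_index_path(b"lib/html/pipeline.rb"))
```

Because the NUL is the only thing marking where the name ends, a reader of the format has no way to tell an embedded NUL from the terminator.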

There are additional limitations imposed by Git clients, which are not part of the design of the data file formats:

  1. A path component cannot be named .git.

  2. A path component may not be named . or .., to prevent you from escaping your working directory.

  3. If core.protectHFS is set, then when all zero-width Unicode characters are removed from a path component, the remainder may not be .git.

  4. If core.protectNTFS is set, then a path component may not be GIT~1, .git\ or .git followed by trailing spaces or dots.
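The client-side rules above can be sketched as a small validator. This is a simplified illustration, not Git's actual check: the function names are hypothetical, the set of zero-width code points is an incomplete example, and the case-insensitive comparisons and the treatment of `\` as a separator under the NTFS rule are assumptions of this sketch.

```python
# Illustrative, incomplete set of zero-width code points (an assumption).
ZERO_WIDTH = {"\u200c", "\u200d", "\ufeff"}


def is_allowed_component(name: str, protect_hfs: bool = False,
                         protect_ntfs: bool = False) -> bool:
    """Apply the listed client-side rules to a single path component."""
    if name in (".", "..", ".git"):
        return False
    if protect_hfs:
        # Drop zero-width characters, then compare against ".git".
        visible = "".join(ch for ch in name if ch not in ZERO_WIDTH)
        if visible.lower() == ".git":
            return False
    if protect_ntfs:
        # Assumption: only the part before a backslash matters.
        head = name.split("\\", 1)[0].lower()
        if head == "git~1" or head.rstrip(" .") == ".git":
            return False
    return True


def is_allowed_path(path: str, **options) -> bool:
    """Reject NUL anywhere, then check every '/'-separated component."""
    if "\0" in path:
        return False
    return all(is_allowed_component(part, **options)
               for part in path.split("/"))


if __name__ == "__main__":
    for candidate in ("src/main.c", "a/.git/config", "a/../b"):
        print(candidate, is_allowed_path(candidate))
    print(is_allowed_path(".g\u200cit/x", protect_hfs=True))
    print(is_allowed_path("GIT~1/x", protect_ntfs=True))
```

The two `protect_*` flags mirror `core.protectHFS` and `core.protectNTFS` only in spirit: they default to off here so the difference between the baseline rules and the filesystem-specific ones is visible.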

And no, you cannot create those with libgit2 either, because it also uses NUL-terminated strings to store paths. A buffer overrun seems unlikely here (if anything, you would expect a buffer underrun).
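As a rough illustration of why a NUL-terminated string cannot carry a NUL in a name, the sketch below uses Python's `ctypes` to read a byte buffer the way C code reads a string. It shows generic C string behaviour only and does not call libgit2; `as_c_string` is a hypothetical helper name.

```python
import ctypes


def as_c_string(data: bytes) -> bytes:
    """Return what C string functions would see: bytes up to the first NUL."""
    buffer = ctypes.create_string_buffer(data)
    return buffer.value


if __name__ == "__main__":
    requested = b"evil.txt\0.png"
    seen = as_c_string(requested)
    # The bytes after the NUL are still in memory, but string handling
    # stops early, so the stored name is shorter than the requested one.
    print(len(requested), len(seen), seen)
```

The name is silently shortened rather than written past the end of a buffer, which is consistent with the answer's point that an overrun looks unlikely.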

Edward Thomson
  • 74,857
  • 14
  • 158
  • 187
  • Wrong! A `/` can be used in [git repositories](https://github.com/jch/html-pipeline). If you [clone this](https://github.com/jch/html-pipeline), the directory containing the `/` will be converted to 2 different directories. Doing a `git update` on that locally cloned repo will split the directory on most OS. – user2284570 Oct 02 '15 at 20:33
  • Are you talking about `lib/html`? That's a `lib` directory with an `html` directory within it. GitHub is collapsing that in their web UI to be nice - that's why the `lib/` is grey. – Edward Thomson Oct 02 '15 at 20:37
  • Do you mean this is automatic? I’m sorry in that case… – user2284570 Oct 02 '15 at 20:39
  • @user2284570: It's not automatic. It's a github (a website) feature. Not a git feature. If you want such a feature in your software you have to implement it yourself like github did. – slebetman Oct 02 '15 at 21:17
  • @EdwardThomson : At least, I see many 500 errors with servers using libgit2 instead of 422 or 403 when I use other characters like & or >. – user2284570 Oct 02 '15 at 21:27
  • Then you should file a bug with them so they can investigate. – Edward Thomson Oct 02 '15 at 21:39
  • @EdwardThomson: `git uses null terminated strings to store filenames internally` I’m curious to look at other things. Would it be possible to get some documentation about the binary data structures used by git objects? *(with the objective of manually changing them of course)* – user2284570 Oct 03 '15 at 07:51
  • If you haven't seen them yet, https://git-scm.com/book/en/v1/Git-Internals and https://github.com/git/git/tree/master/Documentation/technical are the places to start. After that, the source for your favorite git implementation (git, libgit2, dulwich, jgit) is probably going to be best. – Edward Thomson Oct 03 '15 at 09:19
  • @EdwardThomson: Is the path restriction *(`.` and `..` in names)* enforced by database design? – user2284570 Oct 04 '15 at 15:18
  • @EdwardThomson: OK, I tested it and it isn’t. As the question is only asking about the database and not any implementation behaviour, please update your answer. – user2284570 Oct 06 '15 at 00:26
  • @user2284570 You didn't mention the database in your question. More importantly, you didn't mention *which* one. The index and tree objects both contain paths, and do so very differently and thus with different limitations. I'll point those out in my answer. – Edward Thomson Oct 07 '15 at 13:53
  • @EdwardThomson : All kind of objects. – user2284570 Oct 07 '15 at 17:00
  • That's why I gave you both. – Edward Thomson Oct 07 '15 at 17:00