In addition to - and _, which other special characters can a GitHub repository name contain?

Background

I need to run a regex over GitHub URLs, so I need to know the rules for repository root URLs, which have the form

https://github.com/username/repo

where

  • username is the username of the owner of the repository, and,
  • repo is the repository name

So far, my regex works well, but doesn't cater to repositories with special characters, so I must include them. Written in R, the regex is github.com/*/[[:alpha:]].

Note: the rules for GitHub usernames are listed here. I am after the same thing, but for repository names.

stevec

1 Answer

2019: As mentioned in moby/moby issue 679:

it looks like github allows [A-Za-z0-9_.-], and transforms all other characters to "-".

So: in addition to letters, numbers, - and _ the only other allowable character is '.'
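
The substitution described in that quote can be sketched in Python as follows. This is only an illustration: the function name `sanitize_repo_name` is hypothetical, and replacing each disallowed character individually (rather than collapsing runs into a single "-") is an assumption, since the quoted comment does not say which of the two GitHub does.

```python
import re

# Anything outside [A-Za-z0-9_.-] is treated as disallowed, per the quoted comment.
_DISALLOWED = re.compile(r"[^A-Za-z0-9_.-]")


def sanitize_repo_name(name: str) -> str:
    """Replace every character outside [A-Za-z0-9_.-] with '-'.

    Assumption: one '-' per disallowed character; GitHub's exact
    behavior for runs of disallowed characters is not confirmed here.
    """
    return _DISALLOWED.sub("-", name)


print(sanitize_repo_name("my repo!"))   # my-repo-
print(sanitize_repo_name("data_v1.2"))  # data_v1.2
```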

This is illustrated in the GitHub Desktop application by desktop/desktop issue 3090: "Block emoji from being entered as a repo name"(!)


2023: Qunatized mentions in the comments:

I just checked on GitHub and was able to create repositories that:

  1. start with "." or "_",
  2. end with "." or "_",
  3. contain an arbitrary number of consecutive "." or "_" characters, or any combination thereof.

It only converts any characters outside of [A-Za-z0-9_.-] to "-".

I checked, and a repository name can also start or end with '-', in addition to '.' and '_'.

So the current regexp (June 2023) for valid GitHub repository name would be:

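^[\w.-]+$

(The hyphen is placed last in the character class so that it is read as a literal; some engines, Python's re among them, reject \w-\. as an invalid range.)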
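
As a minimal Python sketch of this check, assuming the character set [A-Za-z0-9_.-] reported above is the only constraint: the helper name `is_valid_repo_name` is hypothetical, and any further limits GitHub may apply (length, reserved names) are not covered. The class is spelled out explicitly because in Python 3 `\w` also matches non-ASCII letters and digits, and `fullmatch` is used instead of `^...$` because `$` would also accept a trailing newline.

```python
import re

# Explicit ASCII character class; \w alone would also accept e.g. "é" in Python 3.
_REPO_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def is_valid_repo_name(name: str) -> bool:
    """Return True if name is non-empty and uses only [A-Za-z0-9_.-]."""
    return _REPO_NAME.fullmatch(name) is not None


print(is_valid_repo_name(".hidden_"))  # True
print(is_valid_repo_name("my repo"))   # False
```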
VonC
  • So `[\w.-]+` matches usernames and repos. Thus, for example, something like `match(/github\.com\/([\w.-]+)\/([\w.-]+)/)` would allow extracting username and repo in captured groups. – payne Jan 14 '23 at 18:32
  • @payne Thank you for this interesting feedback. I have included your comment in the answer for more visibility. – VonC Jan 14 '23 at 20:02
  • Repos can also include underscores, though - so this wouldn't work for everything! and usernames can't contain dots. Edit: oops. I didn't realise what \w did quite.. probably best to remove '.' from the username though – TheKodeToad Jun 03 '23 at 17:17
  • @TheKodeToad 3+ years later, I find those regexes a bit too simple to accurately match GitHub valid repository names or GitHub usernames. I have edited the answer. Let me know what you think. – VonC Jun 03 '23 at 19:17
  • Oh, thanks! I saw an npm package for username regex but this seems simpler – TheKodeToad Jun 04 '23 at 08:57
  • @VonC I just checked on GitHub and was able to create repositories that 1) start with "." or "_", 2) end with "." or "_", and 3) contain an arbitrary number of consecutive "." or "_" characters, or any combination thereof. It only converts any characters outside of [A-Za-z0-9_.-] to "-". Am I missing something? Otherwise the regex should be "^[A-Za-z0-9_.-]+$". – Qunatized Jul 05 '23 at 14:48
  • @Qunatized Interesting. So the naming policy would have evolved? – VonC Jul 05 '23 at 15:50
  • @VonC I'm not sure how it was before and unfortunately I couldn't find any official documentation. But I just tested it today and it seems to work. Maybe you could also verify and then update your answer? – Qunatized Jul 05 '23 at 16:50
  • @Qunatized I checked, and the new regexp should now be `^[\w-\.]+$` indeed. I will edit the answer... as soon as I can, since [I can no longer edit answers at the moment](https://meta.stackoverflow.com/q/425430/6309). Thank you again for your feedback. – VonC Jul 05 '23 at 20:36
  • @Qunatized I ([finally!](https://meta.stackoverflow.com/questions/425430/how-can-i-avoid-an-error-occurred-submitting-the-edit-error-message)) managed to update the answer, including your feedback. Thank you for your patience. – VonC Jul 10 '23 at 18:45
  • @VonC Great, thank you! – Qunatized Jul 11 '23 at 00:22
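
The URL extraction suggested in the comments above could look roughly like this in Python. It is a sketch under stated assumptions, not a complete GitHub URL parser: the helper name `parse_repo_url` is hypothetical, the owner segment is restricted to letters, digits and "-" (following the remark that usernames cannot contain dots), and the repository segment uses the [A-Za-z0-9_.-] set discussed in the answer.

```python
import re
from typing import Optional, Tuple

# Owner: letters, digits, '-' (assumption based on the comment about usernames).
# Repo: the [A-Za-z0-9_.-] set from the answer.
_REPO_URL = re.compile(r"github\.com/([A-Za-z0-9-]+)/([A-Za-z0-9_.-]+)")


def parse_repo_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (owner, repo) from a GitHub URL, or None if nothing matches."""
    match = _REPO_URL.search(url)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_repo_url("https://github.com/moby/moby"))  # ('moby', 'moby')
print(parse_repo_url("https://example.com/a/b"))       # None
```

Because the pattern is not anchored, deeper paths such as an issue URL still yield the owner and repository from the first two path segments.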