
What is the right regular expression to validate an FQDN in C# and JavaScript? I have been searching all around and keep finding different specifications. Which one is correct?

A few examples I found:

   1. (?=^.{1,254}$)(^(?:(?!\d+\.|-)[a-zA-Z0-9_\-]{1,63}(?<!-)\.?)+(?:[a-zA-Z]{2,})$)

   2. (?=^.{1,254}$)(^(?:(?!\d|-)[a-zA-Z0-9\-]{1,63}(?<!-)\.?)+(?:[a-zA-Z]{2,})$)

   3. \b((?=[a-z0-9-]{1,63}\.)(xn--)?[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,63}\b (from the Regular Expressions Cookbook)

Please help.

Ro Yo Mi
Shetty
  • [RFC 1035](http://tools.ietf.org/html/rfc1035), http://blog.gnukai.com/2010/06/fqdn-regular-expression/ – Andreas Aug 01 '13 at 06:15
  • @Andreas : thank u. I have seen this. He says "The only deviation to the RFC rules that I make is the extra rule that the top level domain (the part that comes after the last ‘.’) must be characters only, and must be 2 or more (.com, .net, .org, .eu, .uk, ect). I can’t find where that is documented though." Not sure if that is correct. – Shetty Aug 01 '13 at 06:17
  • From [RFC 920](http://tools.ietf.org/html/rfc920) - TLD Reqs: ARPA, GOV, EDU, COM, MIL, ORG or the english two letter country code. So that seems to be a valid extension/modification. – Andreas Aug 01 '13 at 06:46
  • @Anders : What changes should i make to use it in Javascript. I tried following -> var fqdnRegEx = /(?=^.{1,254}$)(^(?:(?!\d+\.|-)[a-zA-Z0-9_\-]{1,63}(?<!-)\.?)+(?:[a-zA-Z]{2,})$)/; it gives error. – Shetty Aug 01 '13 at 06:55
  • @Andreas reflecting on this years later, it feels ancient already – Coffeeholic Mar 20 '21 at 04:43

1 Answer


Generally, the Regular Expressions Cookbook is a good source of information, written by two regex experts, so you should start there. The solution outlined there is not quite adapted to your needs yet (it matches substrings rather than validating an entire string, and it doesn't check the overall length of the string), so we can modify it a little:

/^(?=.{1,254}$)((?=[a-z0-9-]{1,63}\.)(xn--+)?[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,63}$/i

Explanation:

^                      # Start of string
(?=.{1,254}$)          # Assert length of string: 1-254 characters
(                      # Match the following group (domain name segment):
 (?=[a-z0-9-]{1,63}\.) # Assert length of group: 1-63 characters
 (xn--+)?              # Allow punycode notation (at least two dashes)
 [a-z0-9]+             # Match letters/digits
 (-[a-z0-9]+)*         # optionally followed by dash-separated letters/digits
 \.                    # followed by a dot.
)+                     # Repeat this as needed (at least one match is required)
[a-z]{2,63}            # Match the TLD (at least 2 characters)
$                      # End of string
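
As a rough usage sketch (not part of the original answer), the pattern above can be wrapped in a small JavaScript helper. The names `FQDN_PATTERN` and `isFqdn` are made up for illustration; the regex itself is copied unchanged from the answer.

```javascript
// Pattern copied verbatim from the answer above; the `i` flag makes it case-insensitive.
const FQDN_PATTERN = /^(?=.{1,254}$)((?=[a-z0-9-]{1,63}\.)(xn--+)?[a-z0-9]+(-[a-z0-9]+)*\.)+[a-z]{2,63}$/i;

// Hypothetical helper: returns true only when the whole input matches the pattern.
function isFqdn(name) {
  return typeof name === "string" && FQDN_PATTERN.test(name);
}

console.log(isFqdn("www.example.co.uk"));   // true
console.log(isFqdn("my-host.example.com")); // true
console.log(isFqdn("localhost"));           // false: at least one dot-terminated label is required
console.log(isFqdn("-bad.example.com"));    // false: a label may not start with a dash
console.log(isFqdn("exa_mple.com"));        // false: underscores are not accepted
```

Because the pattern uses only lookahead (no lookbehind), the same pattern text, without the `/.../i` delimiters, should also be usable in C# via `new Regex(pattern, RegexOptions.IgnoreCase)`, though that has not been verified here.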
Tim Pietzcker
  • thank u for ur answer. This works. Could you please let me know what resources i can refer to find the rules on FQDN so that i can do more testing? – Shetty Aug 02 '13 at 09:02
  • There are links to some RFCs in the comments above; those documents are pretty complicated, though. Perhaps http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names helps? – Tim Pietzcker Aug 02 '13 at 09:06
  • xn----dtbjjdcfhg5cckn1k9a.xn--p1ai Try ... and get 'false'. – diproart Jun 24 '14 at 13:06
  • @tim Dashes should be allowed after the initial `xn--`. Digits may be allowed in the TLD if the TLD is punycode. As the user above says http://xn----dtbjjdcfhg5cckn1k9a.xn--p1ai is a real host name. – mikemaccana Feb 27 '15 at 11:16
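
One possible way to address the two comments above is sketched below in JavaScript. This is an untested-in-production variant, not the answer author's pattern. It rests on two assumptions of mine: runs of dashes are allowed anywhere inside a label (as long as the label starts and ends with a letter or digit), and the TLD is either letters only or an `xn--` punycode label. The names `FQDN_IDN_PATTERN` and `isFqdnWithIdnTld` are hypothetical.

```javascript
// Assumed variant of the answer's pattern (not from the original post):
//  - labels: letters/digits separated by one or more dashes, 1-63 characters
//  - TLD: 2-63 characters, either letters only or a punycode label starting with "xn--"
const FQDN_IDN_PATTERN = new RegExp(
  "^(?=.{1,254}$)" +                                    // overall length, as in the answer
  "(?:(?=[a-z0-9-]{1,63}\\.)" +                         // each label: 1-63 characters
  "[a-z0-9]+(?:-+[a-z0-9]+)*\\.)+" +                    // no leading or trailing dash
  "(?=[a-z0-9-]{2,63}$)" +                              // TLD length: 2-63 characters
  "(?:[a-z]+|xn--[a-z0-9]+(?:-+[a-z0-9]+)*)$",          // alphabetic or punycode TLD
  "i"
);

// Hypothetical helper around the assumed pattern.
function isFqdnWithIdnTld(name) {
  return typeof name === "string" && FQDN_IDN_PATTERN.test(name);
}

console.log(isFqdnWithIdnTld("xn----dtbjjdcfhg5cckn1k9a.xn--p1ai")); // true
console.log(isFqdnWithIdnTld("www.example.com"));                    // true
console.log(isFqdnWithIdnTld("example.c0m"));                        // false: digits only allowed in a punycode TLD
console.log(isFqdnWithIdnTld("bad-.example.com"));                   // false: label ends with a dash
```

Note that this is looser than the original pattern: it also accepts double dashes in ordinary labels (for example `a--b.example.com`), which may or may not be acceptable depending on how strict the validation needs to be.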