
Regarding the syntax of hostnames, answers to questions like this often refer to RFC 1123 and RFC 952, but fail to mention RFC 921, which seems to place additional restrictions on hostnames. There are probably a number of later RFCs about the DNS (and IDN) which cover constraints on hostnames handled by the DNS.

There is a lot of confusion around the valid syntax of hostnames in general and of hostnames handled by the DNS.

Which RFCs specify the syntax requirements on hostnames and which RFCs specify additional constraints on the hostnames handled by the DNS?

jotik

2 Answers


You're correct to cite RFC 1123 and RFC 952, but you've omitted RFC 2181 "Clarifications to the DNS Specification". Specifically §11 contains this text:

... any binary string whatever can be used as the label of any resource record.

Since a "hostname" is a domain name that has an A record, this text would appear to allow any valid domain name to also be considered a valid hostname.

A couple of years ago I asked one of the authors of this text whether that was the intended interpretation, and he confirmed that it was. However, that view is not widely accepted, and there is still no universally agreed answer within the DNS community to your question of what makes a legal hostname.

P.S. You've misread RFC 1123 - at no point does it say that 63 and 255 are lower limits on labels and domain names. The 63 limit is actually enforced by the wire format of a DNS label, which reserves only 6 bits for the length of a label.
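
A minimal sketch of where the 63 comes from, assuming Python and a hypothetical helper `encode_name` (it is not taken from any DNS library). In the RFC 1035 wire format each label is preceded by a single length octet whose top two bits are 00 for an ordinary label, so only 6 bits remain for the length:

```python
# Hypothetical illustration, not a complete DNS encoder (no compression).
MAX_LABEL_LEN = 0b00111111   # 6 usable bits in the length octet -> 63
MAX_NAME_LEN = 255           # RFC 1035 limit on the encoded name, in octets


def encode_name(name: str) -> bytes:
    """Encode a dotted ASCII name into uncompressed DNS wire format."""
    if name.endswith("."):
        name = name[:-1]          # drop the optional trailing root dot
    out = bytearray()
    for label in name.split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= MAX_LABEL_LEN:
            raise ValueError(f"label length {len(raw)} is outside 1..{MAX_LABEL_LEN}")
        out.append(len(raw))      # length octet, top two bits stay 00
        out += raw
    out.append(0)                 # zero-length root label terminates the name
    if len(out) > MAX_NAME_LEN:
        raise ValueError("encoded name exceeds 255 octets")
    return bytes(out)


print(encode_name("www.example.com"))
```

A 64-character label is rejected here simply because its length no longer fits in those 6 bits, which is why 63 is an upper limit rather than a minimum that software must exceed.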

Alnitak
  • Thanks! RFC 1123 in 2.1 states "Host software MUST handle host names of up to 63 characters and SHOULD handle host names of up to 255 characters." I've corrected the original question. – jotik Aug 08 '14 at 09:53
  • @Jotik Those are still upper limits. It's saying that software implementations MUST handle all legal labels, and SHOULD handle all legal domains. Nowhere is it suggesting that they should support values _higher_ than those limits. It should be read as "up to N _and no more_"! – Alnitak Aug 08 '14 at 10:01
  • The originally described restriction on "hostnames" was to match the fact that, at the time, most hosts were some form of Unix system, and this was deemed to match the restrictions of the Unix operating system for a host name. Poster is correct to cite RFC 2181, but RFC 4343 further updates 2181 – James Stevens May 17 '23 at 10:36
  • @JamesStevens see my comments on the other answer relating to RFC 4343. – Alnitak May 18 '23 at 17:39

You could take a look at RFC 1035. It is a purely DNS-based RFC and explains some of these limitations.

NaeiKinDus
  • [RFC 1035](http://tools.ietf.org/html/rfc1035) was updated by [RFC 2181](http://tools.ietf.org/html/rfc2181). [RFC 4343](http://tools.ietf.org/html/rfc4343) which updates RFC 2181 might also be relevant. – jotik Aug 08 '14 at 09:48
  • This is one of the problems with RFCs - they've been around a very long time and they're not always as mutually consistent as they should be. RFC 2181 that I've mentioned is in effect the canonical text at this point - any binary octet can be used. However the effect of 4343 is that you can't guarantee that the DNS server won't mangle any alphabetic characters therein to a different case. However 4343 says _nothing_ about non-alphanumerics, so according to 2181 they're still fair game. Don't try putting UTF-8 in a domain name, though - case mangling on octets might corrupt the codepoints. – Alnitak Aug 08 '14 at 10:08
  • To circumvent the problem of UTF-8 / UTF-16, [RFC 3492](https://www.ietf.org/rfc/rfc3492.txt) adds some nice "features" (Punycode, "xn--"). A classic case I've seen in the IPAM industry is that hostnames are defined using RFC 1035 (only alphanumerics, no leading underscores and a bunch of other rules), with an exception made for Microsoft DNS records starting with "__". Just a "classical" use case though, nothing really official I'd say. – NaeiKinDus Aug 08 '14 at 11:07
  • @NaeiKinDus indeed - the punycode encoding circumvents the case sensitivity issue, and of course most registries won't let you register an actual domain name with arbitrary binary characters in anyway. The caveat is about trying to use raw binary in a label, although RFC 6891 obsoleted the special "binary label" format introduced in RFC 2671 (EDNS) – Alnitak Aug 08 '14 at 11:58
  • @NaeiKinDus although I consider RFC 1035 mandatory reading, IMHO it's one of the least useful DNS RFCs with respect to the question of legal hostname syntax. – Alnitak Aug 08 '14 at 12:02
  • @Alnitak I agree that it lacks several important details, but it states something that helped me convince customers that the compliance check implemented in the product of a previous company I worked for was correct: [RFC 1035, Page 7](https://tools.ietf.org/html/rfc1035) gives a pretty good BNF that has the virtue of existing. At the end of the day, this topic is pretty vast and, well, kinda complex when it comes to bulletproof assertions. – NaeiKinDus Aug 08 '14 at 14:22
  • @NaeiKinDus that compliance check was probably too strict then, because that BNF no longer applies. It's now perfectly possible for a hostname to contain a label that is completely numeric and therefore not legal according to the RFC 1035 BNF (e.g. www.800.com) – Alnitak Aug 08 '14 at 14:35
  • @Alnitak indeed the BNF has been updated, but several tools still do not fully support that. RFC 1035 was considered the "default" one for such hostnames (and for the record, the product allowed customers to change it). The rule of thumb stated at page 7 is useful IMO ("[...] the prudent user will select a name which satisfies both the rules of the domain system and any existing rules for the object, whether these rules are published or implied by existing programs."). I've had one too many funky results with BIND to ignore this. Thank you for the extended details :) – NaeiKinDus Aug 08 '14 at 14:44
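
The difference discussed in these comments can be sketched roughly as follows, assuming Python. The regular expressions are my own transcription of the RFC 1035 "preferred name syntax" (a label must start with a letter) and of the RFC 1123 relaxation (a label may start with a letter or a digit); the names `RFC1035_LABEL`, `RFC1123_LABEL` and `is_valid` are hypothetical, and the check ignores IDN and the looser RFC 2181 reading entirely:

```python
import re

# Label must start with a letter, end with a letter or digit, hyphens inside only.
RFC1035_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")
# Same shape, but the first character may also be a digit.
RFC1123_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_valid(name: str, label_re: re.Pattern) -> bool:
    """Return True if every label of `name` matches `label_re` and is 1..63 long."""
    if name.endswith("."):
        name = name[:-1]          # tolerate a trailing root dot
    labels = name.split(".")
    return all(
        len(label) <= 63 and label_re.fullmatch(label) is not None
        for label in labels
    )


for candidate in ("www.example.com", "www.800.com", "_sip.example.com"):
    print(candidate,
          "RFC 1035 BNF:", is_valid(candidate, RFC1035_LABEL),
          "RFC 1123:", is_valid(candidate, RFC1123_LABEL))
```

Under these assumptions `www.800.com` fails the stricter pattern and passes the relaxed one, while a leading underscore fails both, which is why products that follow the older BNF need explicit exceptions for such records.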