
I wrote a URL validator for my JSF web page and ran into a problem with domains whose first label (the part before the first dot) contains a non-ASCII character.

I have the following valid website URL: http://testä.com. Converting it to Punycode with IDN.toASCII() produces an invalid URL: xn--http://test-v8a.com.

Shouldn't it be http://xn--test-ooa.com/ instead?

I also checked it with the IDN web converter of DENIC, the registry for German .de domains, which produces the same invalid result.

https://www.denic.de/service/tools/idn-web-converter/

Is this a bug in Java or in the RFC, or am I missing something?

Workaround

When I remove the protocol first, the conversion works.
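The workaround can be sketched roughly as follows. This is only a minimal illustration, not a full URL parser: the class `IdnUrlWorkaround` and the helper `toAsciiUrl` are made-up names, and the helper assumes the URL has the simple form scheme://host with no port, path, query, or user info.

```java
import java.net.IDN;

final class IdnUrlWorkaround {

    // Hypothetical helper: converts only the part after "scheme://".
    // Assumes there is no port, path, query, or user info in the URL.
    static String toAsciiUrl(String url) {
        int sep = url.indexOf("://");
        if (sep < 0) {
            // No scheme present: treat the whole input as a domain name.
            return IDN.toASCII(url);
        }
        String scheme = url.substring(0, sep + 3); // e.g. "http://"
        String host = url.substring(sep + 3);      // everything after the scheme
        return scheme + IDN.toASCII(host);
    }

    public static void main(String[] args) {
        // The escape below stands for the a-umlaut in the example domain.
        String url = "http://test\u00e4.com";
        // Passing the full URL folds the scheme into the first label.
        System.out.println(IDN.toASCII(url));
        // Converting only the host leaves the scheme untouched.
        System.out.println(toAsciiUrl(url));
    }
}
```

For the URL from the question, the first call reproduces the xn--http://test-v8a.com result, while the helper yields http://xn--test-ooa.com.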

Cœur
djmj

1 Answer


The documentation is clear that this method operates only on domain name labels (or on whole domain names made of such labels), not on full URLs. So yes, the protocol needs to be removed before calling it.

A label is an individual part of a domain name. The original ToASCII operation, as defined in RFC 3490, only operates on a single label. This method can handle both label and entire domain name, by assuming that labels in a domain name are always separated by dots.

Link to Javadoc: https://docs.oracle.com/javase/8/docs/api/java/net/IDN.html#toASCII-java.lang.String-int-
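To illustrate the label-splitting behaviour described in the Javadoc, here is a small sketch. The class `IdnLabelDemo` and both helper names are hypothetical. `perLabel` assumes labels are separated by a plain ASCII dot only, and `isStrictHostName` relies on the `IDN.USE_STD3_ASCII_RULES` flag, which makes `toASCII` throw an `IllegalArgumentException` for ASCII characters such as `:` and `/` that are not allowed in host names. That may be useful in a validator, since a full URL is then rejected instead of being silently encoded.

```java
import java.net.IDN;

final class IdnLabelDemo {

    // Converts each dot-separated label on its own and rejoins them.
    // Assumption: only the ASCII dot is used as a label separator.
    static String perLabel(String domain) {
        String[] labels = domain.split("\\.");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < labels.length; i++) {
            if (i > 0) {
                out.append('.');
            }
            out.append(IDN.toASCII(labels[i]));
        }
        return out.toString();
    }

    // Returns true if the input converts cleanly under the STD3 ASCII rules,
    // false if IDN.toASCII rejects it (for example because it is a full URL).
    static boolean isStrictHostName(String input) {
        try {
            IDN.toASCII(input, IDN.USE_STD3_ASCII_RULES);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

With the default flags the protocol simply becomes part of the first label, which is why the question's input produced a result at all. With the strict flag the same input is rejected up front.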

ck1