Questions tagged [punycode]

Punycode is an encoding syntax by which a string of Unicode characters can be translated into the basic ASCII characters permitted in network host names. Examples: mañana.com, bücher.com and café.com.

Punycode is used for internationalized domain names (IDN), which applications handle through IDNA (Internationalizing Domain Names in Applications).

For example, when you type café.com into your browser, the browser (the IDNA-enabled application) first converts the string to the Punycode form "xn--caf-dma.com", because the character 'é' is not allowed in regular domain names. Older browsers without IDNA support do not perform this conversion, so internationalized domain names won't work in them.

Examples:

  • mañana.com
  • bücher.com
  • café.com
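
The conversion described above can be sketched in Python with the standard library's built-in "idna" and "punycode" codecs. This is only an illustration, not what any particular browser does: the helper names `to_ascii` and `to_unicode` are made up for this sketch, and the built-in "idna" codec follows the older IDNA 2003 rules, so its output may differ from newer IDNA implementations for some names.

```python
# Sketch: convert between Unicode host names and their ASCII (Punycode) form
# using Python's built-in codecs. Assumption: IDNA 2003 behaviour is acceptable.

def to_ascii(hostname: str) -> str:
    """Encode each label of a Unicode host name to its ASCII 'xn--' form."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Decode an ASCII host name containing 'xn--' labels back to Unicode."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    # The example names from the tag description, written with escapes so the
    # precomposed characters are unambiguous.
    for name in ["ma\u00f1ana.com", "b\u00fccher.com", "caf\u00e9.com"]:
        ascii_name = to_ascii(name)
        print(name, "->", ascii_name, "->", to_unicode(ascii_name))

    # The raw Punycode algorithm (no "xn--" prefix, no per-label handling)
    # is exposed as a separate codec.
    print("caf\u00e9".encode("punycode"))
```

The "idna" codec works label by label and adds the "xn--" prefix only to labels that contain non-ASCII characters, which is why "com" is left unchanged.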
80 questions
1 vote, 2 answers

PHP: issue with idn_to_utf8(). Certain domains are not converted

In a PHP project I use the idn_to_utf8 function to convert domain names from Punycode to Unicode strings. But sometimes this function returns the Punycode and not the Unicode string. Example: echo idn_to_utf8('xn--fiq57vn0d561bf5ukfonh1o'); // Return…
Samuel Dauzon
1 vote, 1 answer

punycode and .рф cyrillic domain redirect

I have a website with a Cyrillic domain name. There is an authorization lib which redirects the user to the login page, but the URL is somehow malformed. The website is on CodeIgniter and the redirect function used is the standard redirect function of…
MR.GEWA
1 vote, 1 answer

Is an IDN valid in a src attribute or does it have to be Punycode-encoded?

In a UTF-8 encoded HTML document, is it valid to use an IDN as a value for src and href attributes? Are there any objections that enforce the use of the Punycode-encoded version?
dakab
1 vote, 1 answer

Calling feed parsing from web autodecode Punycode for IDN in .NET

I have an RSS feed http://xn--d1abbgf6aiiy.xn--p1ai/feeds When I add this feed via an ASP.NET MVC web app and call the method for parsing the feed, the feed properties are automatically converted from ASCII to their Unicode representation. When I call the same code from…
Radenko Zec
0 votes, 1 answer

Like Box from url with punycode?

I'm trying to make a Like Box from this URL: http://www.facebook.com/pages/I-karriären/238394972905409?sk=wall I tried to generate the code from the Facebook developer page: http://developers.facebook.com/docs/reference/plugins/like-box/ but only get this…
0 votes, 1 answer

Is it possible to write my own Punycode converter in PHP without the intl extension?

I do not have enough control of the remote server to install extensions; PHP is 5.3.8. But I've noticed that it is possible to split a UTF-8 string with PCRE. So for example: preg_split('@@u','bücher',-1,PREG_SPLIT_NO_EMPTY); gives: Array ( [0]…
rsk82
0 votes, 2 answers

There is something wrong with my Puny code

I wanted to create a function using pure JavaScript, without any dependencies. However, my function doesn't seem to provide the correct result. I intended for the domain to still be accessible and functional in web browsers. This is what I…
Rakushoe
0 votes, 1 answer

IDNA Encode Adding Apostrophes and letter B?

I am using the IDNA library to encode/decode Unicode domain names, but when I encode a domain name, it adds apostrophes on either side of the string and prepends the letter b. For example: import idna print(idna.encode('español.com')) Output:…
Mr Fett
0 votes, 0 answers

ICU's IDNA/punycode API doesn't lowercase names by default?

I wrote a small test program (see the end of this post) that uses libicu's uidna_IDNToASCII function to punycode a Unicode domain name. $ g++ -std=c++11 -W -Wall test.cpp -licucore $ ./a.out EXAMΠLE.com xn--examle-s0e.com $ ./a.out…
Quuxplusone
0 votes, 2 answers

An official or de facto EXAMPLE.COM for IDNs?

I'm writing unit tests and want a publicly queryable IDN that I can use in my tests. Does IANA or another body maintain an IDN equivalent to example.com? If not, is there a de facto alternative that serves the same purpose and is reliable? I need an…
Tenders McChiken
0 votes, 0 answers

LIKE search in Punycode values in PostgreSQL

When a domain is stored in Punycode, PostgreSQL loses the ability to match it against the original, non-encoded value with the LIKE operator. For example, if we encode something like примерен.site (example in Bulgarian), we get xn--e1aahqhhhd.site in our database. It…
fabricius
0 votes, 0 answers

Which hostnames do browsers display as punycode?

When I visit the URL http://gibts.überhaupt.nicht (German for: does.not.exist.at.all), my browser (Chrome) internally converts this to the Punycode form http://gibts.xn--berhaupt-55a.nicht because of the umlaut character ü, but it still displays it as…
Heiko Theißen
0 votes, 1 answer

Convert non-ASCII email addresses to Punycode in Java

I am new to development and I have the following task. I need to be able to write international email addresses to my database and retrieve them, i.e. addresses that contain non-ASCII characters like æ, ø, å, ö, ä, ß, ü. In order to do this I need…
0 votes, 2 answers

Ubuntu 22.04 IDN domain.com idn: could not convert from ASCII to UTF-8

In Ubuntu 20.04 and older (and Debian 11, 10 and 9) I can convert Punycode domains with idn to UTF-8 / IDN format: idn -t --quiet -a "xxx-tést.eu" Works fine or the other way around: idn -t --quiet -u "xn--xxx-tst-fya.eu" Also the conversion back…
0 votes, 0 answers

Why do domains containing one emoji redirect me to other unsafe, existing sites?

I was editing a text in vim and typed gx on a play-button emoji to open it as a URL and see what happens. Vim translated the UTF-8 emoji into its Punycode form and wrapped it in a URL format for my browser: xn--g1h.com. The request had been…