Is there an easy way to fetch punycode domains? I tried using the requests module, but it didn't work.

The following code doesn't work:

import requests
requests.get("https://.la")
InvalidURL: Failed to parse: https://.la

I'm using Python 3.10.4, requests==2.28.1, and urllib3==1.26.13.

Aviv
  • I tested it with a domain I have that contains é, and that worked. The issue is related to the emoji, which is basically not legal in IDNs. Emoji are symbols, which were excluded from IDNs about a decade ago (phishers were using symbols too well). See RFC 5892 pages 5-10 or [the github issue for the code you're using](https://github.com/kjd/idna/issues/18). – arnt Dec 14 '22 at 12:40

1 Answer

I solved my problem using the following function:

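from urllib.parse import urlparse, urlunparse


def convert_idna_address(url: str) -> str:
    """Return url with its netloc encoded to ASCII (punycode)."""
    parsed_url = urlparse(url)
    # Python's built-in "idna" codec encodes each dot-separated label.
    return urlunparse(
        parsed_url._replace(netloc=parsed_url.netloc.encode("idna").decode("ascii"))
    )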
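
One caveat with the function above: it encodes the whole netloc, so an explicit port or userinfo is passed through the codec together with the host name. The following is a minimal sketch of a variant that converts only the host name. The name `to_ascii_url` is hypothetical, and the sketch assumes that the built-in "idna" codec (which follows the older IDNA 2003 rules rather than the stricter IDNA 2008 rules discussed in the comment on the question) is acceptable for your domain. IPv6 literals are returned unchanged.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Return url with only its host name converted to IDNA (punycode) form.

    Hypothetical helper, not part of requests or urllib.
    """
    parts = urlsplit(url)
    host = parts.hostname
    # Nothing to convert for relative URLs; IPv6 literals are left untouched.
    if host is None or ":" in host:
        return url
    # Built-in "idna" codec (IDNA 2003): encodes each dot-separated label.
    ascii_host = host.encode("idna").decode("ascii")
    # Rebuild the netloc, keeping any userinfo and port as given.
    userinfo, at, _ = parts.netloc.rpartition("@")
    netloc = userinfo + at + ascii_host
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


if __name__ == "__main__":
    print(to_ascii_url("https://bücher.de:8443/path?q=1"))
    # https://xn--bcher-kva.de:8443/path?q=1
```

The converted URL can then be passed to `requests.get` as usual. Whether the resulting name actually resolves still depends on the registry and DNS, which this sketch does not check.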
Aviv