How do I extract the pay-level domain from a URL? Is there a Java library that does this automatically?
Viewed 148 times
-2
- Does [this](http://stackoverflow.com/questions/1923815/get-the-second-level-domain-of-an-url-java) help? – jrook Apr 15 '17 at 20:26
- @jrook, that's about second level domain, I need the Pay Level Domain – Noor Apr 15 '17 at 20:37
- I think the answer to that question covers that. – jrook Apr 15 '17 at 21:19
1 Answer
1
Last time I checked, I didn't find any library, so I ended up using this regex:
private static final Pattern URL_PATTERN = Pattern.compile(
"(?:^|[\\W])((ht|f)tp(s?):\\/\\/|www\\.)"
+ "(([\\w\\-]+\\.){1,}?([\\w\\-.~]+\\/?)*"
+ "[\\p{Alnum}.,%_=?&#\\-+()\\[\\]\\*$~@!:/{};']*)",
Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
Guava's InternetDomainName might be used to compose it out of the individual elements though.
Example usage: for the domain name mail.google.com, parts() returns the list ["mail", "google", "com"]:
ImmutableList<String> parts = InternetDomainName.from("mail.google.com").parts();

ldz
- I believe the example you mention only splits the host part and returns the parts, which is fairly easy. But based on the parts, and related to your example, how do you know that "google" is the Pay Level Domain? That's why I was asking if there is any library. – Noor Apr 16 '17 at 12:41
- Because I believe that to do this, one must keep a record of the hosts on the right and continually check whether each one is a public or private host. – Noor Apr 16 '17 at 12:42
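The check described in the last comment can be sketched roughly as follows: walk the host's labels, find the longest trailing part that is a known public suffix, and keep one more label to its left. This is only a minimal illustration, not a definitive implementation. The class name PayLevelDomain, its methods, and the five-entry suffix set are all hypothetical; a real version would load the full Public Suffix List and handle its wildcard and exception rules, which this sketch ignores. Guava's InternetDomainName.topPrivateDomain() performs this kind of lookup against a bundled copy of that list.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class PayLevelDomain {
    // Hypothetical, tiny sample of public suffixes (assumption for illustration only).
    // A real implementation would load the complete Public Suffix List instead.
    private static final Set<String> PUBLIC_SUFFIXES =
            new HashSet<>(Arrays.asList("com", "org", "net", "uk", "co.uk"));

    /** Returns the longest known public suffix plus one label, or null if none applies. */
    static String fromHost(String host) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        // Candidates go from longest to shortest, so "co.uk" is matched before "uk".
        for (int i = 0; i < labels.length; i++) {
            String suffix = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (PUBLIC_SUFFIXES.contains(suffix)) {
                // i == 0 means the host itself is a public suffix: no pay-level domain.
                return i == 0 ? null : labels[i - 1] + "." + suffix;
            }
        }
        return null; // no known suffix matched
    }

    /** Extracts the host from an absolute URL and delegates to fromHost. */
    static String fromUrl(String url) {
        String host = URI.create(url).getHost();
        return host == null ? null : fromHost(host);
    }

    public static void main(String[] args) {
        System.out.println(fromUrl("http://mail.google.com/inbox")); // google.com
        System.out.println(fromUrl("https://www.bbc.co.uk/news"));   // bbc.co.uk
    }
}
```

Trying the longest candidate first is what keeps "bbc.co.uk" from being cut down to "co.uk"; with a suffix set this small, any host under a suffix that is not listed simply returns null.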