I am retrieving the domain from a URL string, but I'm wondering what the best way is to get it without ending up with "co" for URLs such as "example.co.uk".

Does anybody know of an algorithm or a .NET Framework method that does this?

I've searched for one, and the answers I found all suggest matching against a list of all TLDs, but that list is currently growing quickly.

Edit:

I've already tried the Uri class and haven't found what I'm looking for in it.

I am trying to extract the registrable part of a host name like "website.example.co.uk", which would be "example.co.uk".

tittodiego
  • Hint: The URI class has most of the stuff you'll need (don't know about this one, haven't checked). Start there. For a better, more concise and thought-through answer, post *exactly* what you are doing right now. Post the expected outcome, given an *actual* URI. – Arran Oct 07 '13 at 11:59
  • Thanks. Added an example. I've already tried Uri and it doesn't look to have what I'm looking for. – tittodiego Oct 07 '13 at 12:11
  • @Arran: `Uri` doesn't split the Host. – Daniel Hilgarth Oct 07 '13 at 12:12
  • @DanielHilgarth, nope, but without his edit it didn't sound like he had even *looked* at it. – Arran Oct 07 '13 at 13:24

2 Answers

Once you've got the hostname from the URI, it would be easy enough to check if it ends with ".co.uk", and if so, extract the last 3 components; otherwise extract the last 2 components. It sounds like that would accomplish what you're asking for; do you actually want something more general?
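
As a minimal sketch of that check: the function name `registrable_domain_naive` and the `three_label_suffixes` parameter are made up for illustration, and Python is used only for brevity (the same string logic ports directly to C# once you have `Uri.Host`). It assumes the input is a full URL including a scheme.

```python
from urllib.parse import urlsplit


def registrable_domain_naive(url, three_label_suffixes=(".co.uk",)):
    """Keep the last 3 labels for known two-level suffixes, else the last 2.

    `three_label_suffixes` is a hypothetical, hand-maintained tuple; it only
    covers the suffixes you list yourself.
    """
    # hostname is None when the URL has no scheme/authority part
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    keep = 3 if host.endswith(three_label_suffixes) else 2
    return ".".join(labels[-keep:])


print(registrable_domain_naive("http://website.example.co.uk/page"))
print(registrable_domain_naive("https://www.example.com"))
```

This handles only the suffixes that are explicitly listed, so it does not generalize beyond them.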

Michael Dyck
  • Thanks for answering, but I would need something more general, not only for ".co.uk". I've made a workaround but I'm not very happy with it. – tittodiego Oct 09 '13 at 10:45

Your problem has no standard, purely algorithmic solution today, and everyone shares the same pain.

There was an IETF working group (DBOUND) that was chartered to find solutions for this problem. There were various proposals, but none of them became a standard, and the group closed. If you are interested: https://datatracker.ietf.org/wg/dbound/about/

The only help that exists today is the "Public Suffix List" at https://publicsuffix.org/. Make sure to read all the explanations there, and understand that this is a manually curated list, so it is not updated in real time and errors can happen.

For .uk, for example, you will find there the current list of suffixes handled by the registry:

// uk : https://en.wikipedia.org/wiki/.uk
// Submitted by registry <Michael.Daly@nominet.org.uk>
uk
ac.uk
co.uk
gov.uk
ltd.uk
me.uk
net.uk
nhs.uk
org.uk
plc.uk
police.uk
*.sch.uk
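
A rough sketch of how such rules can be applied, assuming only the `.uk` entries quoted above (a real implementation would load the full, current list from publicsuffix.org). The names `RULES`, `public_suffix_length` and `registrable_domain` are hypothetical, and the list's exception rules (entries starting with `!`) are deliberately left out. Python is used only for brevity; the logic ports directly to C#.

```python
# Only the .uk rules quoted above; not the full Public Suffix List.
RULES = {
    "uk", "ac.uk", "co.uk", "gov.uk", "ltd.uk", "me.uk", "net.uk",
    "nhs.uk", "org.uk", "plc.uk", "police.uk", "*.sch.uk",
}


def public_suffix_length(labels, rules):
    """Return how many trailing labels form the longest matching suffix."""
    best = 1  # when no rule matches, the list's default rule "*" applies
    for i in range(len(labels)):
        candidate = labels[i:]
        exact = ".".join(candidate)
        # a leading "*" in a rule matches any single label
        wildcard = ".".join(["*"] + candidate[1:])
        if exact in rules or wildcard in rules:
            best = max(best, len(candidate))
    return best


def registrable_domain(host, rules=RULES):
    """Return the public suffix plus one label, or None if there is none."""
    labels = host.lower().rstrip(".").split(".")
    n = public_suffix_length(labels, rules)
    if len(labels) <= n:
        return None  # the host is itself a public suffix
    return ".".join(labels[-(n + 1):])


print(registrable_domain("website.example.co.uk"))
print(registrable_domain("www.myschool.somewhere.sch.uk"))
```

With these rules, "website.example.co.uk" reduces to "example.co.uk", while a bare "co.uk" yields no registrable domain at all.
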
Patrick Mevzek