I have a couple of websites that are hosted as subdomains (e.g., on WordPress, Altervista, Blogpress, ...).
I'm currently using urlparse to split URLs into their components. However, it does not seem to let me distinguish subdomains, only the TLD.
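To make the problem concrete, here is a minimal sketch, assuming the parser in question is Python's urllib.parse.urlparse. The URL and the helper name host_of are made up for illustration. The parser returns the hostname as a single string and does not say which part is the subdomain:

```python
from urllib.parse import urlparse

def host_of(url):
    """Return the hostname of a URL (hypothetical helper for illustration)."""
    return urlparse(url).hostname

# Made-up example URL
parts = urlparse("https://myblog.wordpress.com/2020/01/post")

print(parts.scheme)  # https
print(parts.netloc)  # myblog.wordpress.com  (one undivided string)
print(parts.path)    # /2020/01/post
```

Nothing in the result marks "myblog" as a subdomain of "wordpress.com".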
Alternatively, I could use a vocabulary containing all the subdomain suffixes and, based on that, assign 1 or 0. But since I don't know all the blogs, I'm wondering if there is a way to do the detection automatically.
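The vocabulary idea could look roughly like the sketch below. The suffix set is a small, hand-written assumption and deliberately incomplete, and the function name is hypothetical:

```python
from urllib.parse import urlparse

# Assumed, incomplete vocabulary of hosting suffixes (illustration only)
KNOWN_SUFFIXES = {"wordpress.com", "altervista.org"}

def is_hosted_subdomain(url):
    """Return 1 if the host is a subdomain of a known suffix, else 0."""
    host = (urlparse(url).hostname or "").lower()
    # Require a leading dot so that "wordpress.com" itself does not match
    return int(any(host.endswith("." + suffix) for suffix in KNOWN_SUFFIXES))

print(is_hosted_subdomain("https://myblog.wordpress.com/"))  # 1
print(is_hosted_subdomain("https://wordpress.com/"))         # 0
print(is_hosted_subdomain("https://myblog.blogspot.com/"))   # 0, suffix not in the vocabulary
```

The last call shows the limitation: any platform missing from the vocabulary is silently labelled 0.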
For example, I was thinking of counting the dots, but many websites have extra dots in their hostname without being subdomains, so this approach is not reliable.
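A quick sketch of the dot-counting heuristic shows why it misfires. The threshold of two dots and the example hosts are assumptions chosen for illustration:

```python
from urllib.parse import urlparse

def dot_count_flag(url):
    """Naive heuristic: flag 1 when the hostname contains two or more dots."""
    host = urlparse(url).hostname or ""
    return int(host.count(".") >= 2)

print(dot_count_flag("https://myblog.wordpress.com/"))  # 1, correct
print(dot_count_flag("https://example.com/"))           # 0, correct
print(dot_count_flag("https://example.co.uk/"))         # 1, wrong: "co.uk" is a multi-part suffix
print(dot_count_flag("https://www.example.com/"))       # 1, although "www" is not a hosted blog
```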