
I know this problem is not fully answerable, and will probably never be 100% solvable.

But I am looking for ways/techniques to determine whether an IP belongs to a normal home/business user going through a normal ISP, or whether it's an IP from a hosting company or some other type of network/system/server that is very unlikely to be a normal user on a normal ISP connection. I am really interested to hear about any ways people here have figured this out, as best they could.

Are there lists that might associate netblocks with hosting companies?

Is there any way to differentiate an ISP from a hosting provider, based on any public information?

If you were tasked with solving this problem, as much as you could, what approach would you take?

My knowledge of networks and how the whole internet fits together isn't bad, but I'm more of a software engineer than a network guy and I don't really understand what information is where or how to obtain it. I know that companies can have netblocks assigned to them, and that that information is somewhere, and that's about it.

Any help, greatly appreciated.

EDIT: I'm interested in automated ways of doing this, not a human visiting a website per IP and looking for clues that it belongs to a hosting company. In other words, I'd have a database or something similar, with information and a way to determine, from that data, whether an IP is hosting or ISP, for example.

Conor
  • In order to avoid the [XY problem](http://mywiki.wooledge.org/XyProblem), can you tell us what you intend to do with the information? Why is this needed and what's the goal? – ewwhite Jul 22 '14 at 15:43
  • Basically, it is going to be used for statistical purposes, i.e., who the visitors to a site are (human vs. bots/scrapers/search engines/hosting), and it will also be used to determine what web content to present to the end user/system depending on who they are. We are not trying to block anyone, and performance is not a concern. It would be quite hard to explain my idea here without writing an essay. I'm just looking for methods/techniques on how/where to obtain netblock-type data from various sources, and somehow use that data to see if an IP is ISP or hosting. – Conor Jul 22 '14 at 16:14
  • Sounds like you want to be investigating web analytics services. Don't reinvent the wheel. If you're not happy with the data you're receiving from existing analytics platforms, then it's likely because there's just not enough data available to reliably make the kinds of distinctions you're looking for – dbr Jul 22 '14 at 16:21

3 Answers


EDIT: From your additional comments, it sounds like you want to be investigating web analytics services. Don't reinvent the wheel. If you're not happy with the data you're receiving from existing analytics platforms, then it's likely because there's just not enough data available to reliably make the kinds of distinctions you're looking for.


You may be able to find some useful information by looking up the IP address in the database of the appropriate Regional Internet Registry (RIR). There are five RIRs (ARIN, RIPE NCC, APNIC, LACNIC and AFRINIC), each covering a different region of the world.

A tool such as DomainTools WHOIS will do the work of finding the appropriate RIR, querying the database and presenting you with the results.

The results may or may not help you. If they clearly show the IP address as being part of a block of IP addresses owned by a well-known consumer/small business ISP, then that's a strong indication that the traffic originated from that sort of user. If the owner is not one you recognise, the results might not tell you much.
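
To make that judgement in code rather than by eye, one rough approach is to pull the organisation-like fields out of the raw WHOIS text and match them against hand-maintained keyword lists. The sketch below only illustrates the idea: the keyword lists, the field names, and the sample record are all assumptions made up for illustration, field names differ between registries, and keyword matching like this will misclassify plenty of networks.

```python
import re

# Illustrative keyword lists (assumptions, not an authoritative taxonomy).
HOSTING_HINTS = ("hosting", "datacenter", "data center", "cloud", "server", "colo")
ISP_HINTS = ("broadband", "telecom", "cable", "dsl", "communications")

# Field names vary per registry; these are some commonly seen ones (assumed).
ORG_FIELDS = ("orgname", "org-name", "netname", "descr", "owner")


def org_strings(whois_text):
    """Collect the values of organisation-like fields from raw WHOIS text."""
    values = []
    for line in whois_text.splitlines():
        match = re.match(r"^([A-Za-z-]+):\s*(.+)$", line)
        if match and match.group(1).lower() in ORG_FIELDS:
            values.append(match.group(2).strip())
    return values


def classify(whois_text):
    """Return 'hosting', 'isp', or 'unknown' based on keyword hits."""
    haystack = " ".join(org_strings(whois_text)).lower()
    if any(hint in haystack for hint in HOSTING_HINTS):
        return "hosting"
    if any(hint in haystack for hint in ISP_HINTS):
        return "isp"
    return "unknown"


# Made-up record using a documentation-only address range (RFC 5737).
SAMPLE = """\
NetRange:       192.0.2.0 - 192.0.2.255
NetName:        EXAMPLE-HOSTING-NET
OrgName:        Example Hosting Ltd
"""

if __name__ == "__main__":
    print(classify(SAMPLE))
```

Anything that falls through to "unknown" would still need a human decision or a better data source, which is likely to be the majority of real-world records.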

dbr
  • Hey, no, I'm not looking for analytical services. I need to be able to figure this information out on-the-fly. I am not looking for a third party service that I can query. I am looking for raw *data*, or a way to retrieve that data incrementally, if necessary. Again, it's hard to explain here, but I don't want to be querying third parties over HTTP or relying on that. If there is raw data out there, I'd like information on where and how to understand it, if it's possible. – Conor Jul 22 '14 at 16:57
  • Analytics services would present the information near enough in realtime, and would do lots of nice fancy reports. I'm guessing that's not what you're after? Are you looking for something more like an API or programmatic "feed" that you can read from some other bit of software? – dbr Jul 22 '14 at 17:02
  • Hey, thanks for the response. What would they present that would say whether something was a hosting company or an ISP? And nope, I'm not looking for reports. I need to make decisions, in code, based on what I think the IP is. Performance is a non-issue, even if the answer came from a huge list of hardcoded IP ranges marked as "hosting companies". I'm not looking for an API. I am looking for the raw data and the knowledge to understand it, so that I can build my own API on top of it, for myself, or use it whatever way I see fit. – Conor Jul 22 '14 at 17:59
  • I don't think it's possible to do what you want with any sort of accuracy. The data that the RIRs hold is pretty much all there is available when it comes to who owns an IP address (and therefore where the traffic might be coming from). You could do a reverse DNS lookup on the IP, but again you'd need to look at the host name returned (if anything) and try and make a judgement as to whether it's a run of the mill ISP or not. If you were to try and codify that personal judgement into an algorithm, it would be incredibly difficult and wildly inaccurate. – dbr Jul 22 '14 at 19:08
  • As an example, services like Google Analytics often sort traffic into sources, such as "Virgin Media", "Sky", "BT" (I'm from the UK) based on RIR data, and then chuck everything else into some sort of "other" category. To do even that, they've probably manually created lookup tables for the popular ISPs and the strings that match them in the WHOIS data (I'm guessing here, but I'd be surprised if I was wildly off) – dbr Jul 22 '14 at 19:12

Try a whois lookup of the IP in order to find information about the provider or organization.
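
If you want to do that from code instead of a website, WHOIS is a plain-text protocol over TCP port 43 (RFC 3912), so a minimal client is short. The following is a sketch under assumptions: it asks `whois.iana.org` first and follows the registry named on a `refer:` line, which is how IANA replies usually look but is not guaranteed; the function names are my own; and real registries rate-limit queries, so this is not suitable for bulk lookups.

```python
import socket


def whois_query(server, query, timeout=10.0):
    """Send one query to a WHOIS server (TCP port 43) and return the reply text."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_referral(iana_reply):
    """Extract the server named on a 'refer:' line, or None if there is none."""
    for line in iana_reply.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "refer":
            return value.strip()
    return None


def lookup(ip):
    """Ask IANA which registry holds the address, then query that registry."""
    referral = parse_referral(whois_query("whois.iana.org", ip))
    if referral is None:
        return None
    return whois_query(referral, ip)
```

Calling `lookup("192.0.2.1")` would need network access and returns the registry's raw text, which you would then have to parse yourself since each registry formats its records differently.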

ewwhite

You'll find that several commercial geolocation service providers have already generated/collated such IP-address-to-location and IP-address-to-ISP data sets, and provide APIs for querying those databases as well.

Use cases vary from "Hello visitor from" to preventing online fraud "customer from IP-address in Nigeria, orders with credit card issued in France in an American online store for delivery to Mexico."
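
If you obtain such a data set as a flat list of netblocks with a label each, the lookup itself is straightforward to do locally. Here is a minimal sketch using Python's standard `ipaddress` and `bisect` modules. The blocks and labels below are invented (documentation-only ranges from RFC 5737), and the code assumes IPv4 blocks that do not overlap; a real data set would need its own loader and probably handling for nested ranges.

```python
import bisect
import ipaddress

# Hypothetical labelled netblocks; not real assignments.
RAW_BLOCKS = [
    ("192.0.2.0/24", "hosting"),
    ("198.51.100.0/24", "isp"),
    ("203.0.113.0/25", "hosting"),
]


def build_index(raw_blocks):
    """Return (starts, rows) sorted by first address of each block."""
    rows = []
    for cidr, label in raw_blocks:
        net = ipaddress.ip_network(cidr)
        rows.append((int(net.network_address), int(net.broadcast_address), label))
    rows.sort()
    starts = [row[0] for row in rows]
    return starts, rows


def classify_ip(ip, index):
    """Binary-search the block that could contain ip; 'unknown' if none does."""
    starts, rows = index
    value = int(ipaddress.ip_address(ip))
    pos = bisect.bisect_right(starts, value) - 1
    if pos >= 0 and value <= rows[pos][1]:
        return rows[pos][2]
    return "unknown"


INDEX = build_index(RAW_BLOCKS)

if __name__ == "__main__":
    for address in ("192.0.2.10", "198.51.100.200", "203.0.113.200"):
        print(address, classify_ip(address, INDEX))
```

The sorted list plus binary search keeps lookups fast even with a large table, and the index can be rebuilt whenever the underlying data set is refreshed.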

HBruijn