4

I would like to resolve IP(v4) addresses to owner organizations, from the registry of IP address allocations. To do it, I don't want to become an expert in whois protocols and templates or the structure of the registries themselves. I just want a function that takes an IP address (allocated anywhere in the world) and returns a short string like "IBM Corporation". The same thing I would find by typing "whois n.n.n.n" and eyeballing the result. Reverse DNS is not what I want. Should be free software and run on Linux.

Incredibly to me, I can't find this. The whois program (on Debian) and other user-oriented front-ends give me a result for any IP address, but in all sorts of raw formats. I've found whois libraries that parse results, but they seem to assume I'm a whois expert and know which registry has the records for my query. I think the pieces just need to be put together, but nobody seems to have done it. Have I missed something, or is it easier than I think?

As a bonus, I would like to maintain a cache of these lookups. The cache should store the network range for whois results so that it returns a hit for another IP address in the same network. Ideally, the cache should perform better than a linear search as it grows.
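One way such a cache could look, as a minimal sketch rather than a finished design: keep the range start addresses in a sorted list and binary-search it with `bisect`. The `RangeCache` class and its method names are hypothetical, and the sketch assumes the stored ranges never overlap.

```python
import bisect
import ipaddress


class RangeCache:
    """Cache mapping non-overlapping IPv4 ranges to organization names."""

    def __init__(self):
        self._starts = []   # sorted range start addresses, as integers
        self._entries = []  # parallel list of (end, org) tuples

    def add(self, network, org):
        """Store the org for a CIDR block such as '192.0.2.0/24'."""
        net = ipaddress.ip_network(network)
        start = int(net.network_address)
        end = int(net.broadcast_address)
        # Keep both lists sorted by range start.
        i = bisect.bisect_left(self._starts, start)
        self._starts.insert(i, start)
        self._entries.insert(i, (end, org))

    def lookup(self, ip):
        """Return the cached org for ip, or None on a miss."""
        addr = int(ipaddress.ip_address(ip))
        # Find the last range that starts at or before addr.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0:
            end, org = self._entries[i]
            if addr <= end:
                return org
        return None
```

Lookups are O(log n). Inserts still shift list elements, so a very large or write-heavy cache might want a tree or trie instead.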

The purpose? I would find this incredibly helpful for analyzing server logs. Reverse DNS is mostly useless these days, but I would still like some idea of who's responsible for requests.

Andrew

4 Answers

2

I think I found a better approach to this problem. I was wrong to think that reverse DNS is useless: there is more to reverse DNS than I knew! For example, given the IP address 8.12.32.96, there is no PTR record for 96.32.12.8.in-addr.arpa:

host -t ptr 96.32.12.8.in-addr.arpa
Host 96.32.12.8.in-addr.arpa not found: 3(NXDOMAIN)

But I just learned that you can query the delegation records for 32.12.8.in-addr.arpa:

host -t ns 32.12.8.in-addr.arpa 
32.12.8.in-addr.arpa name server dns1.textdrive.com.
32.12.8.in-addr.arpa name server dns2.textdrive.com.
32.12.8.in-addr.arpa name server dns3.textdrive.com.
32.12.8.in-addr.arpa name server dns4.textdrive.com.

Pretty informative! We can look for the common suffix and associate the address with the textdrive.com domain.

I know this because jdresolve does it (with the --recursive option). And it can cache. This seems to be a great tool for analyzing network logs, with a clever and innovative way to resolve "unresolvable" IP addresses. It accomplishes the same thing I was trying to do using WHOIS.
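The two string-handling steps of this approach can be sketched in Python. This is only an illustration, not how jdresolve itself is implemented: `delegation_zone` and `common_suffix` are hypothetical helper names, the sketch assumes the delegation sits on a /24 boundary (it often does not), and the actual NS query is left to `host -t ns` or a DNS library.

```python
import ipaddress


def delegation_zone(ip):
    """Return the /24 reverse zone name for an IPv4 address."""
    # reverse_pointer gives e.g. '96.32.12.8.in-addr.arpa'; drop the first label.
    pointer = ipaddress.ip_address(ip).reverse_pointer
    return pointer.split(".", 1)[1]


def common_suffix(names):
    """Return the longest domain suffix shared by all names, or ''."""
    # Compare labels from the right-hand (top-level) end of each name.
    label_lists = [n.rstrip(".").lower().split(".")[::-1] for n in names]
    shared = []
    for labels in zip(*label_lists):
        if len(set(labels)) != 1:
            break
        shared.append(labels[0])
    return ".".join(reversed(shared))
```

With the name servers listed above, `common_suffix` yields `textdrive.com`. A real tool would also need to walk up to shorter zones (/16, /8) when the /24 zone has no NS records.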

Andrew
1

My service http://ipinfo.io offers an API that returns the company name as the org field:

$ curl http://ipinfo.io/198.252.206.16
{
  "ip": "198.252.206.16",
  "hostname": "stackoverflow.com",
  "city": null,
  "region": null,
  "country": "US",
  "loc": "38.0000,-97.0000",
  "org": "AS25791 Stack Exchange, Inc."
}

You can get just that field by adding /org to the URL:

$ curl http://ipinfo.io/198.252.206.16/org
AS25791 Stack Exchange, Inc.

Adding your own client-side caching shouldn't be too tricky. You can find out more details about the API at http://ipinfo.io/developers.
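A minimal sketch of such client-side caching in Python, using the `/org` URL shown above. The function names are hypothetical, the cache is a plain in-memory dict with no expiry, and the service's current behavior (rate limits, tokens) may differ from the example output in this answer.

```python
import urllib.request

_cache = {}  # ip string -> org string, kept in memory only


def fetch_org(ip):
    """Fetch the org field from the /org endpoint shown above."""
    url = f"http://ipinfo.io/{ip}/org"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8").strip()


def cached_org(ip, fetch=fetch_org):
    """Return the org for ip, calling fetch only on a cache miss."""
    if ip not in _cache:
        _cache[ip] = fetch(ip)
    return _cache[ip]
```

This caches per address only. Sharing a result across a whole network would need the range information as well, which the `/org` response does not include.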

Ben Dowling
1

There is no real set format for whois information. You will have to parse through the data and make guesses. I suggest looking for OrgName:, Organisation:, Organization:, and there are probably plenty of others.
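As a rough sketch of that guessing approach in Python: scan the raw whois text for the field names mentioned above and take the first match. `guess_org` is a hypothetical helper, and the field list is certainly incomplete.

```python
import re

# Field names suggested above; real registries use others too.
ORG_FIELDS = ("OrgName", "Organisation", "Organization")

_ORG_RE = re.compile(
    r"^[ \t]*(?:%s)[ \t]*:[ \t]*(\S.*?)[ \t]*$" % "|".join(ORG_FIELDS),
    re.IGNORECASE | re.MULTILINE,
)


def guess_org(whois_text):
    """Return the first organization-like field value, or None."""
    match = _ORG_RE.search(whois_text)
    # strip() also removes a trailing '\r' from CRLF-terminated responses.
    return match.group(1).strip() if match else None
```

The input would be whatever the `whois` command prints for the address. Responses that contain several records (for example a referral plus the final answer) may need the last match rather than the first.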

If you are just doing this for your own sites, I recommend using an Analytics package to do this work for you. Google Analytics is great but does not analyze your web server's logs. You would have to use something like Web Trends.

Brad
  • It's a shame to me that there isn't a community effort to write and collect the parsers for the different formats. :-( My need is more complex than simple web logs, and I don't even know how something like Web Trends would integrate with what I'm doing (not to mention I have neither the inclination nor the budget for a commercial package.) Thanks for the ideas. – Andrew Oct 15 '10 at 19:17
  • Then I recommend simply starting by parsing your own. I doubt the list of possible formats is *that* extensive. Knock 'em out one by one until you cover all of the formats that you can find. – Brad Oct 15 '10 at 19:36
0

As Brad correctly pointed out in his answer, there is no standard, no way to detect the same information for all responses.

You need to create one parser for each response format, and it requires a really huge effort.

One year ago I started the project of creating a pure-ruby WHOIS client and parser. The library is open-source, so feel free to fork it and contribute back.

Currently it provides more than 150 different parsers. Not all parsers support the Organization information, but the library has a very flexible DSL so you can easily add it.

Simone Carletti