4

Given a target URI, how can I programmatically determine whether an HTTP GET of that URI would be making a request to the local machine?

Context: There are two reasons I need to do this. One is that I have a mod_perl2 application that responds to HTTP requests. In doing so, it sometimes needs to make an HTTP request to retrieve some data from a target URI. To avoid an infinite recursion of HTTP requests, I need to avoid making the HTTP request if the target URI would actually resolve to the current machine. This is to prevent users from accidentally shooting themselves in the foot. It is not intended as a security check.

The second reason is that, if my application receives an HTTP request, I need to look up some metadata using the request URI as the key. The problem is that any of several URI synonyms could have been used as keys in creating the metadata, so I need a way to resolve the synonyms, but only for URIs on the local host machine.

The problem is not as simple as looking at the URI to see if the domain is "localhost", or its IP address is 127.0.0.1 (or 127.0.1.1 or 127.*), because: (a) the target URI might use a fully qualified domain name (e.g., foo.example.com) that resolves to an IP address on the current machine; and (b) a machine can have several IP addresses.

The OS must have the information needed to figure this out, since it has to know the IP addresses and ports on which it listens. This post discusses the problem of trying to determine the local machine's IP address (or addresses, since it may have several). Maybe I could do that to determine the local machine's IP addresses, and then perhaps I could compare those IP addresses against the IP address in the target URI (or the IP address returned by gethostbyname of the URI's domain). Do I really need to do that? Are there problems with that approach? Is there a better way?

This post indicates that C# has a property, HttpContext.Current.Request.IsLocal, that does what I need, but I have been unable to find anything similar in Perl.

I previously asked this question on perlmonks.org (because I'm using perl) but found no satisfactory answer. If there is a solution available in some other programming language that is commonly available on Linux, such as C, bash or python, that would be adequate also. I do not need a solution that is guaranteed to work in every possible case, but it would be nice if it would work for most cases.

DavidBooth
  • It could also point to the IP address of a load balancer which is going to rewrite the packet to point to the local machine. – derobert Apr 30 '14 at 03:00
  • Random suggestion for the first case: On your app-generated requests, set some custom HTTP header. Check for it when receiving a request, and return an error if it's present. Remember you can put as much trace info as you need in your custom header (e.g., all the nodes the request has been through: if node2 gets a node1 request, it sends the header with both node2 and node1 in it. Node3 would be OK, but either node1 or node2 would say "no"). – derobert Apr 30 '14 at 03:00
  • Add a comment containing a unique string (such as an MD5 hash of the system's hostid or CPUID) to /robots.txt. Retrieve it via HTTP and compare it to the robots.txt in the local filesystem. – Mark Plotnick Apr 30 '14 at 03:04
  • The HTTP header idea seems best so far. The /robots.txt idea is similar in requiring an extra HTTP request. But ideally I'd like to avoid the extra HTTP request altogether, even if it would not work in the presence of a load balancer. – DavidBooth Apr 30 '14 at 03:51
  • [.NET HttpRequest.IsLocal](http://msdn.microsoft.com/en-us/library/system.web.httprequest.islocal%28v=vs.110%29.aspx) "The IsLocal property returns true if the IP address of the request originator is 127.0.0.1 or if the IP address of the request is the same as the server's IP address." – miracle173 Apr 30 '14 at 04:02

3 Answers

2

Since I found no better solution I ended up implementing this almost exactly as suggested by @EightBitTony and someone else on perlmonks. After getting the host out of the URI, which can be done using the perl URI module, here is the perl code I used to determine whether the host is local:

#! /usr/bin/perl -w

use strict;

use Socket;
use IO::Interface::Simple;

print "127.0.1.1  is local\n" if &IsLocalHost("127.0.1.1");
print "google.com is local\n" if &IsLocalHost("google.com");
exit 0;

################ IsLocalHost #################
# Is the given host name, which may be either a domain name or
# an IP address, hosted on this local host machine?
# Results are cached in a hash for fast repeated lookup.
sub IsLocalHost
{
my $host = shift || return 0;
our %isLocal;   # Cache
return $isLocal{$host} if exists($isLocal{$host});
my $packedIp = gethostbyname($host);
if (!$packedIp) {
    $isLocal{$host} = 0;
    return 0;
    }
my $ip = inet_ntoa($packedIp) || "";
our %localIps;      # Another cache
%localIps = map { ($_, 1) } &GetIps() if !%localIps;
my $isLocal = $localIps{$ip} || $ip =~ m/^127\./ || 0;
# TODO: Check for IPv6 loopback also.  See:
# http://ipv6exchange.net/questions/16/what-is-the-loopback-127001-equivalent-ipv6-address
$isLocal{$host} = $isLocal;
return $isLocal;
}

################ GetIps #################
# Lookup IP addresses on this host.
sub GetIps
{
my @interfaces = IO::Interface::Simple->interfaces;
my @ips = grep {$_} map { $_->address } @interfaces;
return @ips;
}
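
The step of getting the host out of the URI is not shown in the code above. A minimal sketch of that step, assuming the CPAN URI module is installed; the helper name HostFromUri is hypothetical and not part of the code I actually used:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;

################ HostFromUri #################
# Hypothetical helper (an illustration, not the original code):
# return the lowercased host part of a URI string, or undef if the
# URI has no host component (e.g. a mailto: URI or a relative path).
sub HostFromUri
{
    my $uriString = shift;
    return undef if !defined $uriString;
    my $uri = URI->new($uriString);
    # Only server-style URI classes (http, https, ftp, ...) have host().
    return undef if !$uri->can('host');
    my $host = $uri->host;
    return undef if !defined $host || $host eq "";
    return lc $host;
}

print HostFromUri("http://foo.example.com:8080/some/path"), "\n";
```

The returned host can then be passed to IsLocalHost. Lowercasing the host keeps the cache from holding separate entries for names that differ only in case.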
DavidBooth
1

There is a naive solution to this:

  1. extract the fully qualified domain name, hostname or IP address from the URI in question.
  2. resolve that to an IP address
  3. compare that against a list of IP addresses on the current host
  4. if there is a match, then this URI points to this host

This works as long as:

  1. the URI doesn't resolve to another host which then redirects to this one
  2. the URI doesn't resolve to a load balancer which then balances back to this host
  3. the host doesn't use a proxy (e.g., a caching proxy) or some other device in the chain that could handle the request.
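
The naive steps above can be sketched in Perl using only the core Socket module. Note that this sketch swaps the list comparison in step 3 for a different technique: instead of enumerating local addresses, it asks the kernel whether a socket can be bound to each resolved address. It assumes Linux with the default net.ipv4.ip_nonlocal_bind=0 setting, and the name IsLocalByBind is hypothetical. Wildcard, multicast and broadcast addresses can also be bound, so treat a positive result for those as loose.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(getaddrinfo AF_UNSPEC SOCK_DGRAM);

################ IsLocalByBind #################
# Hypothetical helper: return 1 if any address that $host resolves
# to (IPv4 or IPv6) can be bound on this machine, else 0.
# Assumption: non-local binds are disabled (the Linux default).
sub IsLocalByBind
{
    my $host = shift;
    return 0 if !defined $host || $host eq "";
    # Step 2: resolve the name to all of its addresses.
    my ($err, @results) = getaddrinfo($host, undef,
        { family => AF_UNSPEC, socktype => SOCK_DGRAM });
    return 0 if $err;
    foreach my $ai (@results) {
        socket(my $sock, $ai->{family}, $ai->{socktype}, $ai->{protocol})
            or next;
        # Steps 3 and 4: binding to port 0 sends no traffic; it only
        # succeeds if the address is assigned to this machine.
        my $bound = bind($sock, $ai->{addr});
        close($sock);
        return 1 if $bound;
    }
    return 0;
}

print "localhost is local\n" if IsLocalByBind("localhost");
```

Because getaddrinfo returns every address for the name, this checks IPv6 addresses as well, but it shares all three caveats listed above.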

However, I think your question is too broad, and you would be better off breaking it down into two questions:

  1. How do I extract the IP address, hostname or FQDN from a URI? (Ask that on a programming site.)
  2. How do I enumerate all the IP addresses on a single host? (If that host is a Linux server, ask that question here.)

This isn't really an answer, but it's too long for a comment, and I suspect your question is going to be closed.

EightBitTony
-2
start cmd: # ip route get 192.168.1.2
local 192.168.1.2 dev lo  src 192.168.1.2 
    cache <local>
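
For use from a Perl program, the command above could be wrapped roughly as follows. This is a sketch, assuming the iproute2 `ip` command is on the PATH and that a local address produces output beginning with the word `local`, as in the sample above. Both helper names are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

################ RouteOutputIsLocal #################
# Hypothetical helper: does the output of `ip route get` start
# with "local", as in the sample output above?
sub RouteOutputIsLocal
{
    my $output = shift;
    return 0 if !defined $output;
    return $output =~ m/^local\s/ ? 1 : 0;
}

################ IsLocalByRoute #################
# Hypothetical helper: run `ip route get <ip>` and classify the result.
# Expects an IP address, not a host name; resolve the name first.
sub IsLocalByRoute
{
    my $ip = shift;
    # Accept only characters that can occur in an IPv4/IPv6 literal.
    return 0 if !defined $ip || $ip !~ m/^[0-9A-Fa-f:.]+$/;
    # List-form pipe open: the command runs without a shell.
    open(my $fh, '-|', 'ip', 'route', 'get', $ip) or return 0;
    my $output = do { local $/; <$fh> };
    close($fh);
    return RouteOutputIsLocal($output);
}

print "127.0.0.1 is local\n" if IsLocalByRoute("127.0.0.1");
```

Separating the parsing from the command invocation lets the parsing be checked against saved output on a machine where `ip` is not installed.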
  • I do not understand your answer. Is it written in a programming language? If so, what language? – DavidBooth Apr 30 '14 at 03:42
  • @DavidBooth It's a command: you run `ip route get 1.2.3.4`. @HaukeLaging That could use a little explanation, such as that it's from iproute2, that you should look for the `local`, etc. – phemmer Apr 30 '14 at 03:43
  • @DavidBooth I guess if you don't understand my answer (and expected some "real programming") then you are on the wrong site. You should ask on Stack Overflow instead. –  Apr 30 '14 at 04:06
  • -1 I guess this is not the kind of information Patrick asked for – miracle173 Apr 30 '14 at 05:29
  • @miracle173 I am pleased to be judged by someone who talks about .NET on unix.sx. And now let's get this OT stuff away from here. –  Apr 30 '14 at 05:39
  • @Patrick, thanks, that was the context I needed to interpret Hauke Laging's answer. It looks like I am still left with implementing this in two steps: (a) resolve the name to an IP address; and (b) check whether that IP address is local. The first step can be done with Perl's gethostbyname function. The second can be done either using the ip command, as suggested by Hauke Laging, or by getting a list of all local IP addresses using the Perl module IO::Interface::Simple, as suggested on perlmonks. – DavidBooth May 03 '14 at 06:18