60

I don't understand what it means to bind a socket to any address other than 127.0.0.1 (or ::1, etc.).
Am I not -- by definition -- binding the socket to a port on my own machine... which is localhost?
What sense does it make to bind to, or listen on, a port of another machine or IP address?
Conceptually, it just doesn't make sense to me!

(This has proven surprisingly hard to Google... possibly because I'm not Googling the right terms.)

user541686
  • What it means to bind a socket to *any* IP address other than *INADDR_ANY*, where 'other' *includes* `localhost`, is that it will only accept connections made to that IP address. – user207421 Apr 15 '19 at 19:58

1 Answer

75

A socket is bound to an address and port either to receive data on that socket (the most common case) or to use that address/port as the source when sending data (for example, for data connections in an FTP server).
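The less common case, binding to choose the source address of outgoing data, can be sketched as follows. This is a minimal illustration that assumes Python's standard `socket` module; `connect_from` is a hypothetical helper name, and the loopback listener exists only so the example has something to connect to.

```python
import socket

def connect_from(source_ip, dest):
    """Connect to dest, using source_ip as the local (source) address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((source_ip, 0))   # port 0: let the OS pick a free source port
    s.connect(dest)
    return s

# Demonstration listener on loopback (an assumption of this sketch).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

client = connect_from("127.0.0.1", server.getsockname())
conn, peer = server.accept()

source = client.getsockname()   # address/port the client was bound to
print("client source address:", source)
print("server sees peer as:  ", peer)

for s in (conn, client, server):
    s.close()
```

The peer address reported by `accept()` is the address and port the client bound to before connecting.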

A machine usually has several interfaces: the loopback pseudo-interface, through which the machine can reach itself, plus Ethernet, WLAN, VPN and so on. Each of these interfaces can have multiple IP addresses assigned. For example, loopback usually has 127.0.0.1 and, with IPv6, also ::1, but you can assign others too. Ethernet or WLAN interfaces have addresses on the local network, e.g. 172.16.0.34.

If you bind a receiving socket to a specific address, you can only receive data sent to that IP address. For example, if you bind to 127.0.0.1 you can receive data from your own system but not from another system on the local network, because other systems cannot send data to your 127.0.0.1. First, anything they send to 127.0.0.1 goes to their own 127.0.0.1. Second, your 127.0.0.1 is an address on your internal loopback interface, which is not reachable from outside.
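A rough way to see this on a single machine is sketched below, assuming Python's standard `socket` module. The helper `can_connect` is a hypothetical name, and 127.0.0.2 is used only as "a different address than the one bound". On Linux the whole 127.0.0.0/8 range is local, so the second attempt is normally refused; on other systems it may fail differently (for example with a timeout), which the helper treats the same way.

```python
import socket

def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:          # refused, unreachable, timed out, ...
        return False

# Listen on 127.0.0.1 only; port 0 lets the OS choose a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
port = listener.getsockname()[1]

reached_bound = can_connect("127.0.0.1", port)   # the bound address
reached_other = can_connect("127.0.0.2", port)   # same port, other address

print("connect to 127.0.0.1:", reached_bound)
print("connect to 127.0.0.2:", reached_other)

listener.close()
```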

You can also bind a socket to a catch-all address such as 0.0.0.0 (IPv4) or :: (IPv6). In this case it is not bound to a specific IP address but can receive data sent to any IP address of the machine.
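The catch-all case can be sketched the same way, again assuming Python's standard `socket` module and using loopback as the one local address that is certain to exist. The listening socket reports the wildcard address, while each accepted connection reports the concrete local address the client actually connected to.

```python
import socket

# Bind to the IPv4 wildcard address; port 0 lets the OS choose a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("0.0.0.0", 0))
listener.listen(1)
port = listener.getsockname()[1]

# Reach the listener through one of the machine's addresses (loopback here).
client = socket.create_connection(("127.0.0.1", port), timeout=2.0)
conn, peer = listener.accept()

listening_on = listener.getsockname()[0]   # the wildcard address
accepted_on = conn.getsockname()[0]        # the address the client targeted

print("listening socket bound to:", listening_on)
print("connection accepted on:   ", accepted_on)

for s in (conn, client, listener):
    s.close()
```

A client on the local network could reach the same listener through the machine's Ethernet or WLAN address, which is not possible when the socket is bound to 127.0.0.1.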

Kevin
Steffen Ullrich
  • 1
    +1 Thanks for the answer. This raises several questions for me actually: (1) So is it correct to say that the "address" that I `bind` to is actually specifying the *interface* that I'm binding to? (2) Notwithstanding the first part, what if I have the same address on two interfaces? Will it bind to both simultaneously? (3) Is it actually true that binding to 127.0.0.1 prevents other systems from sending me packets from a security standpoint? Can't they manually send a packet that specifies that as a bogus IP address for the target? (4) Is a socket bound to a bogus address 100% unreachable? – user541686 Sep 04 '16 at 06:51
  • 1
    @Mehrdad: (1) no, you are not binding to the interface but to the address on the interface. Binding to 127.0.0.1 will not receive data for 127.1.1.1 even if this might be the same interface. (2) you cannot have the same IP on different interfaces, and if you do chaos will occur (i.e. the behavior is undefined). (3) yes, binding to 127.0.0.1 restricts access to those systems which can reach your 127.0.0.1, which should be only your own system. Most systems will reject or drop packets which arrive on an interface where the target address is not configured. – Steffen Ullrich Sep 04 '16 at 06:55
  • 1
    (4) if nobody is able to send you data to the bogus address then you will not be able to receive data there. This makes the socket effectively unusable in most cases. – Steffen Ullrich Sep 04 '16 at 06:56
  • Thanks! Regarding #2, I don't quite understand. For example, imagine I have a built-in and a USB Wi-Fi adapter, and each one is connected to a different router, but their IP addresses both happen to be 192.168.1.60. That should be completely possible, right? Now what happens if I bind a socket to that address? Would I get packets from either interface? – user541686 Sep 04 '16 at 07:23
  • 1
    @Mehrdad: if you have such a setup and a TCP SYN for the IP address comes in, then your system might send out the SYN-ACK on a different interface because that one claims to be in the same network. That's what I mean by chaos, i.e. it might work, might not work, or might work only sometimes. The behavior might also depend on the specific OS. – Steffen Ullrich Sep 04 '16 at 09:40
  • Seems like an awful design, though I can believe it. Thanks for the explanation! – user541686 Sep 04 '16 at 09:45
  • 3
    @Mehrdad: no idea what is awful with that. Just imagine you have a city where you have the same street name (i.e. IP address) multiple times in different parts of the city (i.e. interface). Unless you have some other way to distinguish the streets in the address (i.e. ZIP code) chaos will happen when trying to deliver the mail. – Steffen Ullrich Sep 04 '16 at 09:53
  • Well, it's like having the address "123 4th Avenue" in two different cities, and then suddenly getting confused and chaos ensuing when people try to send you mail. It makes no sense at all -- they're in different cities! Shouldn't they be totally unrelated? – user541686 Sep 04 '16 at 10:18
  • @Mehrdad: what you call "city" is not part of the address in TCP/IP. On the wire there are only IP addresses, not cities. – Steffen Ullrich Sep 04 '16 at 10:37
  • I'm not sure you understood the analogy. "City" here corresponds to the **local** network -- the ones where hosts have addresses of the form 192.168.1.X. We're not talking about "on the wire" either, we're talking about the endpoint machine. Along the route, there is no need for any disambiguating information because nothing is ambiguous; routers only see (unique) external IP + port. The only device that sees duplicate internal addresses is the endpoint, and obviously it **knows** which device/interface it receives any packet from, so it should be able to reply on the same device/interface... – user541686 Sep 04 '16 at 11:12
  • @Mehrdad: at OSI layer 2 there is the interface. The IP address is at OSI layer 3 and things like connections (TCP) happen only at OSI layer 4. This means TCP (layer 4) has no idea which interface (layer 2) the packet came in on. Apart from that, there are actual use cases for asymmetric routing, where incoming and outgoing traffic take different paths. This can for example be done in load balancing situations where the load balancer assigns a server to deal with the traffic but the server then replies directly to the client. – Steffen Ullrich Sep 04 '16 at 11:48
  • Sigh. I never said asymmetric routing should be prohibited, did I? I just said the common symmetric case should work sensibly instead of resulting in chaos. Somehow you keep telling me why the design doesn't allow something so obviously sensible, and I don't know why, because I'm not disagreeing with you. I **do** understand the design doesn't allow this. That's precisely why I said it's an awful design. A sensible design would send replies along the same path by default without any trouble by default; anything else is broken. That's all. – user541686 Sep 04 '16 at 11:58
  • @Mehrdad: I'm not telling you why something sensible does not work but I'm trying to tell you that this is not sensible. Imagine that you have on two different interfaces a router with 10.0.0.1 which gives the system via DHCP an IP of 10.0.0.2. If you then try to contact 10.0.0.1 from the system - which of the two routers should it use? It will consult the routing table so probably the router which assigned the address last will win. And the same will be done with all outgoing packets - consult the routing table. – Steffen Ullrich Sep 04 '16 at 12:12
  • What I'm trying to say is that opening a socket should require specifying an interface first. There could easily be a "default" interface for a system or something. Heck, even if there was *no* ambiguity to begin with, if I were to just send a DNS request to `myip.opendns.com`, I would obviously get different responses based on which interface I sent it from. So obviously I need to specify which interface to send it from... – user541686 Sep 04 '16 at 18:30
  • @Mehrdad: the interface used to send a packet is defined by routes. Routes are usually set up together with the configuration of an IP address on a specific interface. If you send a request to myip.opendns.com it will first do a DNS lookup for the name and then find the interface to send out the request based on the routing table, i.e. usually where the default route points to. You can bind a socket to a specific interface, but this is usually only done when raw sockets want to listen on a specific interface no matter which IP is configured there. – Steffen Ullrich Sep 04 '16 at 19:17
  • "What I'm trying to say is that opening a socket should require specifying an interface first." - in effect you want the address to be "IP address + interface". But that's simply not how "address" in the IP protocol was defined in the first place. Which means that you either have to accept that having the same IP address on multiple interfaces is bad or that you have to convince everybody to change their API. – Steffen Ullrich Sep 04 '16 at 19:21
  • No, "in effect" these two are **not** the same thing. We're **only** talking about the endpoint machine here -- and the interface ID is not public information; it's only relevant to the endpoint machine itself and would never be advertised. No router would need to include the client's interface ID in the address because the IP address by itself would already uniquely identify the destination of any packet for any router. – user541686 Sep 04 '16 at 20:30