
I'm studying socket programming, and the server socket accept() is confusing me. I wrote two scenarios for server socket accept(), please take a look:

  1. When the server socket does accept(), it creates a new (client) socket that is bound to a port different from the one the server socket is bound to. Socket communication then goes through the newly bound port, while the server socket (used for accept() only) waits for further client connections on the originally bound port.

I think this is not quite correct, because (1) a port maps to a single process and (2) accepting a connection happens inside the process, and a single process can have multiple sockets. So I thought of a second scenario, based on some Stack Overflow answers:

  2. When a server socket does accept(), it creates a new (client) socket that is not bound to any specific port. When a client communicates with the server, it uses the port bound to the server socket (the one that accept()s connections), and the client socket that actually handles the communication is resolved by the (sourceIP, sourcePort, destIP, destPort) tuple from the TCP header(?) at the transmission level (this is also suspicious, because I thought a socket was somewhat of an application-level object).

This scenario also raises some questions. If the socket communications still use server socket's port, i.e. client sends some messages to the server socket port, doesn't it use the server socket's backlog queue? I mean, how can messages from a client be distinguished between connect() and read() or write()? And how can they be resolved to each client socket in the server, without any port binding?

If one of my scenarios is correct, would that answer the questions above? Or perhaps both of my scenarios are wrong. I'd be very thankful if you could guide me to the correct answers, or at least towards some relevant texts to study.

pro-gramer
Jongbin Park
  • [This](http://stackoverflow.com/questions/17731247/serversocket-accept-returns-socket-on-arbitrary-port) is one of the questions I found, and it seems to support scenario 2. – Jongbin Park Mar 29 '15 at 17:01
  • No it doesn't. There is nothing there about creating a socket that is not bound to any port. It doesn't support either of your scenarios. – user207421 Mar 29 '15 at 18:08
  • The `accept` call is ridiculously simple. If it succeeds, it creates a new socket that references a connection to one of its listening sockets that the kernel has already accepted. It's just bound to that specific TCP connection. That particular TCP connection is identified by the unique combination of source IP address, source port, destination IP address, and destination port. – David Schwartz Jan 19 '16 at 01:08

3 Answers


When you create a socket, bind it, and then call listen on it, what you have is called a listening socket. When a connection is established, this socket is basically cloned into a new socket, called the servicing socket. The port to which the servicing socket is bound is still the same as the original port.

But there is an important distinction between this socket and the listening socket from before: it is part of a socket pair, and it is the socket pair that uniquely identifies the connection. Since a socket pair involves 2 sockets, there are 2 IP addresses and 2 ports, one for each end of the TCP communication channel. While cloning the servicing socket, the kernel's TCP layer allocates what is called a TCB (transmission control block) and stores those 2 IP addresses and 2 ports in it. The TCB also contains the socket number that belongs to the TCB.

Each time a TCP segment comes in, the kernel checks the TCP header to see whether or not it is a SYN. A SYN means connection establishment, so for a SYN the kernel goes through its list of listening sockets (an existing connection is already past that stage). For a normal TCP segment, not a SYN, both port numbers are in the TCP header and the IP addresses are in the IP header, so using this information the kernel is able to find the TCB that belongs to this TCP connection. (A SYN carries this information too, but as I said, a SYN only has to be processed against the listening sockets.)

That is in a nutshell how it works.
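To make the lookup concrete, here is a minimal sketch in Python. It is a toy model, not kernel code: the class `ToyTcpDemux`, its method names, and the addresses are all made up for illustration, and the handshake is collapsed into a single step (a SYN immediately creates the TCB entry).

```python
from typing import Dict, Optional, Tuple

# Key of a toy "TCB": (local_ip, local_port, remote_ip, remote_port)
ConnKey = Tuple[str, int, str, int]


class ToyTcpDemux:
    """Hypothetical model of how segments are matched to sockets."""

    def __init__(self) -> None:
        # Listening sockets are found by local address and port only.
        self.listeners: Dict[Tuple[str, int], str] = {}
        # Servicing sockets are found by the full four-value tuple.
        self.connections: Dict[ConnKey, str] = {}

    def listen(self, ip: str, port: int, name: str) -> None:
        self.listeners[(ip, port)] = name

    def deliver(self, src_ip: str, src_port: int,
                dst_ip: str, dst_port: int, syn: bool) -> Optional[str]:
        """Return the name of the socket that handles this segment."""
        key = (dst_ip, dst_port, src_ip, src_port)
        if syn:
            # A SYN is matched against the listening sockets only.
            listener = self.listeners.get((dst_ip, dst_port))
            if listener is not None:
                # Simplification: create the TCB entry right away.
                self.connections[key] = f"conn {src_ip}:{src_port}"
            return listener
        # Any other segment is matched against the existing TCBs.
        return self.connections.get(key)


if __name__ == "__main__":
    demux = ToyTcpDemux()
    demux.listen("10.0.0.1", 80, "listener")
    print(demux.deliver("10.0.0.2", 5000, "10.0.0.1", 80, syn=True))
    print(demux.deliver("10.0.0.2", 5000, "10.0.0.1", 80, syn=False))
```

Two clients that both send to port 80 end up under different keys because their remote address or remote port differs, which is why their data is never mixed up.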

This information can be found in UNIX Network Programming: The Sockets Networking API. It describes the link to the sockets, whereas other reference material usually does not go into that much detail and tends to highlight the nitty-gritty of TCP instead.

user207421
Philip Stuyck

> When server socket do accept(), it creates a new (client) socket that is bind to port that is different from the port server socket is bind. So socket communication is done via newly bind port, and server socket (for accept() only) is waiting for another client connection on originally bind port.

No.

> I think this is not quite proper answer

It is a wrong answer.

> because (1) port matches to a single process

That doesn't mean anything relevant.

> and (2) socket accept is inside-process matters

Nor does that. It doesn't appear to mean anything at all actually.

> and single process can have multiple sockets.

That's true but it doesn't have any bearing on why your answer is wrong. The reason your answer is wrong is because no second port is used.

> When server socket do accept(), it creates a new (client) socket that is not bind to any specific port

No. It creates a second socket that inherits everything from the server socket: port number, buffer sizes, socket options, ... everything except the file descriptor and the LISTENING state, and maybe I forgot something else. It then sets the remote IP:port of the socket to that of the client and puts the socket into ESTABLISHED state.
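This is easy to check from user space. The following is a small sketch, assuming a loopback interface is available; the function name `accepted_socket_addresses` is made up for the example. It listens on a kernel-chosen port, connects to it, and compares the addresses of the listening socket, the accepted socket, and the client.

```python
import socket


def accepted_socket_addresses() -> dict:
    """Collect local/remote addresses of a listener, its accepted socket, and the client."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        listener.listen(5)
        with socket.create_connection(listener.getsockname()) as client:
            conn, _peer = listener.accept()
            with conn:
                return {
                    "listener_local": listener.getsockname(),
                    "accepted_local": conn.getsockname(),
                    "accepted_remote": conn.getpeername(),
                    "client_local": client.getsockname(),
                }


if __name__ == "__main__":
    for name, addr in accepted_socket_addresses().items():
        print(f"{name:16} {addr}")
```

The accepted socket should report the same local IP and port as the listening socket, and its remote address should be the client's local address. No second server-side port appears anywhere.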

> and when client communicates with the server

The client has already communicated with the server. That's why we are creating this socket.

> it uses the port that is bind to server socket (who accept()s connections) and which client socket to actually communicate is resolved by (sourceIP, sourcePort, destIP, destPort) tuple from TCP header(?) at Transmission level

This has already happened.

> (This is also suspicious because I thought socket is somewhat application-level object)

No it isn't. A socket is a kernel-level object with an application-level file descriptor to identify it.

> If the socket communications still use server socket's port, i.e. client sends some messages to server socket port, doesn't it uses server socket's backlog queue?

No. The backlog queue is for incoming connect requests, not for data. Incoming data goes into the socket receive buffer.
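A rough way to see the difference, sketched under the assumption of a loopback interface and a stack that completes handshakes before `accept()` is called (the function name `data_sent_before_accept` is invented for the example): two clients connect and send before the server accepts anything. The connections wait in the backlog queue, while each client's bytes wait in the receive buffer of its own connection.

```python
import socket


def data_sent_before_accept():
    """Return (expected, received) maps from client address to payload."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(5)
        server_addr = listener.getsockname()

        clients = []
        expected = {}
        for payload in (b"first", b"second"):
            # connect() completes although accept() has not been called yet:
            # the finished connection sits in the listener's backlog queue.
            client = socket.create_connection(server_addr)
            client.sendall(payload)
            expected[client.getsockname()] = payload
            clients.append(client)

        received = {}
        for _ in clients:
            conn, peer = listener.accept()  # takes one connection off the backlog
            with conn:
                conn.settimeout(5)
                received[peer] = conn.recv(1024)  # data comes from this connection's buffer

        for client in clients:
            client.close()
        return expected, received


if __name__ == "__main__":
    expected, received = data_sent_before_accept()
    print(received == expected)
```

Both clients sent to the same server port, yet each accepted socket yields only the bytes of its own client.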

> I mean, how can messages from client be distinguished between connect() and read() or write()?

Because a connect() request sets special bits in the TCP header (the SYN flag). The final part of the handshake (the last ACK) can be combined with data.
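As an illustration, the sketch below builds a bare 20-byte TCP header by hand and classifies it by its flag bits. The flag values and header layout follow RFC 793; the helper names `build_tcp_header` and `classify` and the port numbers are made up, and the checksum is simply left at zero.

```python
import struct

# Flag bits in the TCP header (RFC 793)
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def build_tcp_header(src_port: int, dst_port: int, seq: int, ack: int, flags: int) -> bytes:
    """Pack a minimal TCP header without options; checksum left as 0."""
    offset_and_flags = (5 << 12) | flags  # data offset of 5 words = 20 bytes
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_and_flags, 65535, 0, 0)


def classify(header: bytes) -> str:
    """Tell a connection request apart from ordinary traffic by its flags."""
    (offset_and_flags,) = struct.unpack("!H", header[12:14])
    if offset_and_flags & SYN and not offset_and_flags & ACK:
        return "connection request"
    if offset_and_flags & SYN:
        return "connection reply"
    return "established traffic"


if __name__ == "__main__":
    print(classify(build_tcp_header(50000, 80, 1, 0, SYN)))
    print(classify(build_tcp_header(50000, 80, 2, 1, ACK | PSH)))
```

A segment produced by `connect()` carries SYN, while segments produced by `write()` carry ACK (and usually PSH) but not SYN, so the receiving TCP never has to guess.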

> And how can they be resolved to each client sockets in server, WITHOUT any port binding?

Port binding happens the moment the socket is created in the call to accept(). You invented this difficulty yourself. It isn't real.

> If one of my scenario is correct, would answer to the questions following?

Neither of them is correct.

> Or possibly I'm making two wrong scenarios, so it would be very thankful for you to provide right answers, or at least some relevant texts to study.

Surely you already have relevant texts to study? If you don't, you should read RFC 793 or W.R. Stevens, TCP/IP Illustrated, volume I, relevant chapters. You have several major misunderstandings here.

Community
user207421
  • Sockets don't get a binding merely by being created. You either bind, and then the socket is bound of course, or you do an active connect. During the active connect, if the socket is not bound, an ephemeral port is allocated, and not sooner; at least this is how it works in the TCP/IP stack that I maintain. – Philip Stuyck Mar 29 '15 at 18:53
  • @PhilipStuyck Total confusion here. A socket created by the kernel can have a binding any time it likes. What you're describing is what an *application* does with a *client* socket. The question is about what the *kernel* does with a socket being created by `accept()`. I don't know what 'during an active accept if the socket is not binded' even means. The socket is being created at this point, and it doesn't escape the kernel until it *is* bound. It's an atomic operation. There is certainly no ephemeral port allocated. You're contradicting your own answer here. – user207421 Mar 29 '15 at 19:19
  • Your statement 'Port binding happens the moment the socket is created' could be read as saying that a call to `socket()` results in a bound socket, which it doesn't. During an active connect you can work with a socket that is not explicitly bound, in which case an ephemeral port is allocated. But I guess you are talking about the socket that is created by `accept()`. I did not mean to confuse. – Philip Stuyck Mar 29 '15 at 19:19
  • @PhilipStuyck Only if you haven't read the question. My answer is about the question, which is about what the kernel does *during* the `accept()` function. It is not about what happens in client application code. Can we please stick to the point? – user207421 Mar 29 '15 at 19:29
  • @EJP, Could you please explain, how the data from the clients, being sent to the same port (specified in ServerSocket's constructor) is not mixed up between various Socket's which were created by accept() earlier? After all, all those sockets are listening to the same port, and while each of them sends the data to a specific client, the incoming data goes to the same port. Does it have something to do with TCP headers? – parsecer Aug 30 '16 at 22:52
  • @parsecer The data is identified by the tuple *{source IP, source port, target IP, target port},* and that tuple is unique to a connection. The incoming data doesn't go to a port. It goes to TCP, and from there to a socket. – user207421 Aug 30 '16 at 23:38

From the Linux programmer's manual, as found via `man 2 accept`:

> The accept() system call is used with connection-based socket types (SOCK_STREAM, SOCK_SEQPACKET). It extracts the first connection request on the queue of pending connections for the listening socket, sockfd, creates a new connected socket, and returns a new file descriptor referring to that socket. The newly created socket is not in the listening state. The original socket sockfd is unaffected by this call.

So what happens is that you have a listening TCP socket. Someone requests to connect().

You then call accept(). The old listening socket remains in listening mode, while a new socket is created in connected mode. The new socket's local port is the original listening port.

That does not interfere with the listening socket, because the new socket does not listen for incoming connections.
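The two states can be observed directly. This is a minimal sketch that assumes Linux, where the `SO_ACCEPTCONN` socket option reports whether a socket is listening (other platforms may not expose it); the function name `listening_flags` is made up for the example.

```python
import socket


def listening_flags():
    """Return (listener_is_listening, accepted_is_listening) after one accept()."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(5)
        with socket.create_connection(listener.getsockname()):
            conn, _peer = listener.accept()
            with conn:
                # SO_ACCEPTCONN is nonzero only for sockets in the listening state.
                return (
                    bool(listener.getsockopt(socket.SOL_SOCKET, socket.SO_ACCEPTCONN)),
                    bool(conn.getsockopt(socket.SOL_SOCKET, socket.SO_ACCEPTCONN)),
                )


if __name__ == "__main__":
    print(listening_flags())
```

The listening socket should still report that it accepts connections after the call, and the socket returned by `accept()` should not.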

Marcus Müller
  • Thanks! So there are two (or more) modes for a socket: LISTENING and CONNECTED, right? May I ask, then, **when** and **how** the target socket is resolved in the DoD model? I've seen several diagrams connecting the transmission layer and the application layer via a **port**, but a port is not enough to find the target socket. – Jongbin Park Mar 29 '15 at 17:38
  • That is another question. Post it as another question. – Marcus Müller Mar 29 '15 at 17:41
  • @ParkJongBin What do 'when and how target socket is resolved in DoD mode' and 'port is not enough to get target socket' *mean*? – user207421 Mar 29 '15 at 18:39
  • He is simply asking how for every tcp segment that comes in, how this is eventually fed to the right socket. – Philip Stuyck Mar 29 '15 at 18:46
  • @PhilipStuyck He may be asking that, but not simply. What does the 'DoD model' have to do with it for example? – user207421 Mar 29 '15 at 18:52