10

I'm trying to use UNIX sockets for inter-thread communication. The program is only intended to run on Linux. To avoid creating the socket files, I wanted to use "abstract" sockets, as documented in unix(7).

However, I don't seem to be able to connect to these sockets. Everything works if I'm using "pathname" sockets, though.

Here is the code (I haven't quoted any error handling, but it's done):

thread#1:

int log_socket = socket(AF_LOCAL, SOCK_STREAM, 0);
struct sockaddr_un logaddr;
socklen_t sun_len = sizeof(struct sockaddr_un);
logaddr.sun_family = AF_UNIX;
logaddr.sun_path[0] = 0;
strcpy(logaddr.sun_path+1, "futurama");
bind(log_socket, &logaddr, sun_len);
listen(log_socket, 5);
accept(log_socket, &logaddr, &sun_len);
... // send - receive

thread#2:

struct sockaddr_un tolog;
int sock = socket(AF_LOCAL, SOCK_STREAM, 0);
tolog.sun_family = AF_UNIX;
tolog.sun_path[0] = 0;
strcpy(tolog.sun_path+1, "futurama");
connect(sock, (struct sockaddr*)&tolog, sizeof(struct sockaddr_un));

If all I do in the above code is change sun_path so that it has no leading \0, things work perfectly.

strace output:

t1: socket(PF_FILE, SOCK_STREAM, 0)         = 0
t1: bind(0, {sa_family=AF_FILE, path=@"futurama"}, 110)
t1: listen(0, 5)
t2: socket(PF_FILE, SOCK_STREAM, 0) = 1
t2: connect(1, {sa_family=AF_FILE, path=@"futurama"}, 110 <unfinished ...>
t2: <... connect resumed> )     = -1 ECONNREFUSED (Connection refused)
t1: accept(0,  <unfinished ...>

I know that the connect comes before accept, that's not an issue (I tried making sure that accept() is called before connect(), same result. Also, things are fine if the socket is "pathname" anyway).

Pawel Veselov
  • For communication between threads of the *same* process, an ordinary `pipe(2)` should be enough! And you could also use pipes if all the communicating processes and/or threads have the same parent process! – Basile Starynkevitch Jul 24 '12 at 23:47
  • @BasileStarynkevitch pipe will not work in my case. I need multiple threads to send info, and receive a synchronous response before moving on. – Pawel Veselov Jul 25 '12 at 00:18
  • You could have several pipes and use `poll` for multiplexing them. – Basile Starynkevitch Jul 25 '12 at 00:19
  • @BasileStarynkevitch for this, I will have to know in advance how many maximum pipes to open, or limit access to one using locks. The socket approach has less overhead for such case. – Pawel Veselov Jul 25 '12 at 00:29

4 Answers

17

While I was posting this question, and re-reading unix(7) man page, this wording caught my attention:

an abstract socket address is distinguished by the fact that sun_path[0] is a null byte (’\0’). All of the remaining bytes in sun_path define the "name" of the socket

So, once I bzero'ed sun_path before copying my name into it, things started to work. I figured that's not necessarily straightforward. Additionally, as rightfully pointed out by @davmac and @StoneThrow, the number of those "remaining bytes" can be reduced by passing an address length that covers only the bytes you want to count as your address. One way to do that is the SUN_LEN macro; however, SUN_LEN uses strlen() on sun_path, so the first byte of sun_path has to be temporarily set to a non-zero value while the length is computed.
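As a rough sketch of the "pass only the length you need" approach, the helper below fills in an abstract address and returns the length to hand to bind() or connect(). The name abstract_addr is made up for this illustration and is not part of any library; the length formula is the usual offsetof(struct sockaddr_un, sun_path) + 1 + strlen(name).

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper (not a library function): fill *addr with an abstract
 * address for `name` and return the length to pass to bind()/connect().
 * Returns 0 if the name is empty or does not fit into sun_path. */
socklen_t abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t n = strlen(name);

    if (n == 0 || n > sizeof(addr->sun_path) - 1)
        return 0;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* sun_path[0] stays '\0' (abstract namespace); the name follows it
     * without a terminating nul byte. */
    memcpy(addr->sun_path + 1, name, n);

    /* Cover the family, the leading '\0' and exactly n name bytes. */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}
```

Both sides would then call it the same way, e.g. len = abstract_addr(&addr, "futurama"); followed by bind(fd, (struct sockaddr *)&addr, len) on one side and connect(fd, (struct sockaddr *)&addr, len) on the other.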

elaboration

If sun_path[0] is \0, the kernel uses all of the remaining bytes of sun_path that are covered by the address length passed to the syscall as the name of the socket, whether they are \0-terminated or not, so every one of those bytes counts. In my original code I zeroed the first byte and then strcpy()'d the socket name into sun_path at position 1, while passing sizeof(struct sockaddr_un) as the length. Whatever gibberish was in sun_path when the structure was allocated (especially likely since it's allocated on the stack) was therefore included in the address, counted as part of the socket name, and was different in bind() and connect().
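To make the effect reproducible, here is a small sketch that pre-fills the structure with a chosen byte (standing in for stack garbage) and always passes sizeof(struct sockaddr_un) as the length. The function names make_addr and try_connect are invented for this example. It relies on connect() on a UNIX stream socket completing as soon as the connection is queued in the listen backlog, so no accept() is needed.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: build an abstract address the way the question does,
 * but pre-fill the structure with `fill` to imitate leftover stack bytes.
 * The caller guarantees that `name` fits into sun_path. */
void make_addr(struct sockaddr_un *addr, const char *name, int fill)
{
    memset(addr, fill, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    addr->sun_path[0] = '\0';
    strcpy(addr->sun_path + 1, name);
}

/* Bind and listen with one fill byte, connect with another, always passing
 * the full structure size. Returns 0 if connect() succeeded, -1 otherwise. */
int try_connect(const char *name, int server_fill, int client_fill)
{
    struct sockaddr_un srv, cli;
    int rc = -1;
    int lfd = socket(AF_UNIX, SOCK_STREAM, 0);
    int cfd = socket(AF_UNIX, SOCK_STREAM, 0);

    make_addr(&srv, name, server_fill);
    make_addr(&cli, name, client_fill);

    if (lfd >= 0 && cfd >= 0 &&
        bind(lfd, (struct sockaddr *)&srv, sizeof(srv)) == 0 &&
        listen(lfd, 1) == 0 &&
        connect(cfd, (struct sockaddr *)&cli, sizeof(cli)) == 0)
        rc = 0;

    if (cfd >= 0)
        close(cfd);
    if (lfd >= 0)
        close(lfd);   /* closing the listener releases the abstract name */
    return rc;
}
```

With identical fill bytes on both sides (for example 0 and 0) the connect should succeed; with different fill bytes the two addresses differ after the name and the connect is expected to fail with ECONNREFUSED, which is what the original code ran into.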

IMHO, strace should fix the way it displays abstract socket names: if sun_path[0] is 0, it should display all the sun_path bytes from 1 up to the structure length that was supplied.
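For illustration only, this is roughly the kind of display I have in mind. The function format_abstract_name is made up for this sketch and has nothing to do with how strace is actually implemented; it prints '@' followed by every byte covered by the supplied length, escaping non-printable bytes as \xNN.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: render the abstract name covered by addrlen into out.
 * Leaves out empty if the address is not an abstract one or addrlen is out
 * of range. Output is silently truncated if out is too small. */
char *format_abstract_name(const struct sockaddr_un *addr, socklen_t addrlen,
                           char *out, size_t outsz)
{
    size_t off = offsetof(struct sockaddr_un, sun_path);
    size_t used = 0;

    if (outsz == 0)
        return out;
    out[0] = '\0';
    if (addrlen < off + 1 || addrlen > sizeof(*addr) || addr->sun_path[0] != '\0')
        return out;

    size_t path_len = addrlen - off;   /* bytes of sun_path covered by addrlen */
    used += (size_t)snprintf(out, outsz, "@");

    /* Each byte needs at most 4 characters plus the terminating nul. */
    for (size_t i = 1; i < path_len && used + 5 <= outsz; i++) {
        unsigned char c = (unsigned char)addr->sun_path[i];
        if (isprint(c) && c != '\\')
            used += (size_t)snprintf(out + used, outsz - used, "%c", c);
        else
            used += (size_t)snprintf(out + used, outsz - used, "\\x%02x", c);
    }
    return out;
}
```

Printed this way, an address bound with the full structure size would show its trailing bytes explicitly, so two names that differ only after the visible text would no longer look identical.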

Pawel Veselov
  • so how does this answer the question? – Karoly Horvath Jul 24 '12 at 23:51
  • @KarolyHorvath I added the elaboration, let me know if it's still unclear. – Pawel Veselov Jul 25 '12 at 00:14
  • `strcpy` will add terminating zero... are you saying bytes after that alter the behaviour? – Karoly Horvath Jul 27 '12 at 11:20
  • @KarolyHorvath yes, exactly. Not altering the behavior so to speak, but just defining the socket address. *ALL* the bytes in sun_path matter, no matter what they are. – Pawel Veselov Jul 27 '12 at 22:52
  • @PawelVeselov Actually this is not quite correct (I realise this is a quite old question, but I thought it's worth noting the following). All the address matters regardless of nul termination, but the address size is that _given_. In your case you passed `sizeof(struct sockaddr_un)` as the size parameter (to `bind` and `connect`); if you had instead passed `3 + strlen("futurama")` (which includes the family, the nul prefix and the nul terminator) then your code would have worked fine. – davmac Aug 27 '15 at 09:41
  • @davmac I would not recommend playing around with passing incomplete sockaddr structures; if nothing else, it's highly unportable. – Pawel Veselov Nov 10 '15 at 20:08
  • @PawelVeselov considering the question is specifically asking about the _Linux abstract namespace_ I don't see how portability is an issue. In any case I was more trying to highlight an inaccuracy in your answer than suggest a normal mode of use. – davmac Nov 11 '15 at 10:55
  • @PawelVeselov (in any case passing an "incomplete" sockaddr structure is actually quite portable, see http://man7.org/linux/man-pages/man7/unix.7.html - under "Pathname sockets" - the recommended value of _addrlen_). – davmac Nov 11 '15 at 10:59
  • (Also, interesting, the page I linked to appears to be an updated version of the man page you quote. It says "The socket's address in this namespace is given by the additional bytes in sun_path _that are covered by the specified length_ of the address structure." - not "all of the remaining bytes"). – davmac Nov 11 '15 at 11:11
  • @PawelVeselov - This may not be useful anymore due to the age of this question, but I think the `addrlen` you're passing to `bind()` is incorrect: that value should be the length of the _address_, e.g. `SUN_LEN(logaddr.sun_path)` (and note: before and after using that macro you need to unset/set logaddr.sun_path[0] to `NULL`, respectively). – StoneThrow May 17 '17 at 20:50
  • @StoneThrow According to http://stackoverflow.com/questions/2307511/proper-length-of-an-af-unix-socket-when-calling-bind, using `sizeof` is not incorrect, but may create more waste. Otherwise, this is what @davmac was talking about earlier. – Pawel Veselov May 18 '17 at 04:01
3

The key to making sockets in the abstract namespace work is passing the proper length to the bind() and connect() calls. To avoid writing a '\0' at the end of the address in sockaddr_un, the name should be copied with strncpy or the like.

It is already explained in Pawel's answer so I'm just going to give an example.

Server:

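#include <errno.h>
#include <stddef.h>     //offsetof
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(int argc, char** argv)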
{
  //to remove warning for unused variables.
  (void)argc;
  (void)argv;

  int fdServer = 0;
  int fdClient = 0;
  int iErr     = 0;
  int n = 0;
  socklen_t addr_len = 0;
  char buff[1024];
  char resp[1024];

  const char* const pcSocketName = "/tmp/test";

  struct sockaddr_un serv_addr; 

  //set the structure with 'x' instead of 0 so that we're able 
  //to see the full socket name by 'cat /proc/net/unix'
  //you may try playing with addr_len and see the actual name 
  //reported in /proc/net/unix
  memset(&serv_addr, 'x', sizeof(serv_addr));
  serv_addr.sun_family = AF_UNIX;
  serv_addr.sun_path[0] = '\0';
  //sizeof(pcSocketName) returns the size of 'char*' this is why I use strlen
  strncpy(serv_addr.sun_path+1, pcSocketName, strlen(pcSocketName));

  fdServer = socket(PF_UNIX, SOCK_STREAM, 0);
  if(-1 == fdServer) {
    printf("socket() failed: [%d][%s]\n", errno, strerror(errno));
    return(-1);
  }

  iErr = bind(fdServer, (struct sockaddr*)&serv_addr, offsetof(struct sockaddr_un, sun_path) + 1/*\0*/ + strlen(pcSocketName));

  if(0 != iErr) {
    printf("bind() failed: [%d][%s]\n", errno, strerror(errno));
    return(-1);
  }

  iErr = listen(fdServer, 1);
  if(0 != iErr) {
    printf("listen() failed: [%d][%s]\n", errno, strerror(errno));
    return(-1);
  }

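  while(1) {
    //accept() overwrites addr_len, so reset it to the buffer size every time
    addr_len = sizeof(serv_addr);
    fdClient = accept(fdServer, (struct sockaddr*) &serv_addr, &addr_len);
    //accept() returns -1 on error; 0 is a valid descriptor
    if(-1 == fdClient) {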
      printf("accept() failed: [%d][%s]\n", errno, strerror(errno));
      return(-1);
    }

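    memset(resp, 0, sizeof(resp));
    memset(buff, 0, sizeof(buff));
    //leave room for the terminating '\0' so buff can be printed as a string
    n = recv(fdClient, buff, sizeof(buff) - 1, 0);
    if(0 > n) {
      printf("recv() failed: [%d][%s]\n", errno, strerror(errno));
      return(-1);
    }

    printf("[client]: %s\n", buff);
    //snprintf truncates instead of overflowing resp
    snprintf(resp, sizeof(resp), "echo >> %s", buff);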
    n = send(fdClient, resp, sizeof(resp), 0);
    if(0 > n) {
      printf("send() failed: [%d][%s]\n", errno, strerror(errno));
      return(-1);
    }
    printf("[server]: %s\n", resp);
    //done with this client, release its descriptor
    close(fdClient);
  }

  close(fdServer);

  return(0);
}

Client:

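#include <errno.h>
#include <stddef.h>     //offsetof
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(int argc, char** argv) {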
  //to remove warning for unused variables.
  (void)argc;
  (void)argv;

  int fdClient = 0;
  struct sockaddr_un serv_addr;
  int iErr     = 0;
  const char* const pcSocketName = "/tmp/test";

  char buff[1024];

  memset(&serv_addr, 0, sizeof(serv_addr));
  serv_addr.sun_family = AF_UNIX;
  serv_addr.sun_path[0] = '\0';
  strncpy(serv_addr.sun_path+1, pcSocketName, strlen(pcSocketName));

  fdClient = socket(PF_UNIX, SOCK_STREAM, 0);
  if(-1 == fdClient) {
    printf("socket() failed: [%d][%s]\n", errno, strerror(errno));
    return(-1);
  }

  iErr = connect(fdClient, (struct sockaddr*) &serv_addr, offsetof(struct sockaddr_un, sun_path) + 1/*\0*/ + strlen(pcSocketName));
  if(0 != iErr) {
    printf("connect() failed: [%d][%s]\n", errno, strerror(errno));
    return(-1);
  }

  memset(buff, 0, sizeof(buff));
  sprintf(buff, "Hello from client!");


  printf("[client]: %s\n", buff);
  iErr = send(fdClient, buff, sizeof(buff), 0);
  if(0 > iErr){
    printf("send() failed: [%d][%s]\n", errno, strerror(errno));
    return(-1);
  }

  iErr = recv(fdClient, buff, sizeof(buff), 0);
  if(0 > iErr){
    printf("recv() failed: [%d][%s]\n", errno, strerror(errno));
    return(-1);
  }

  printf("[server]: %s\n", buff);

  return(0);
}
Bernardo Ramos
Jimmar6
1

In my case, replacing strncpy() with snprintf() and increasing the copy size to UNIX_PATH_MAX solved the problem.

Original

strncpy(server_addr.sun_path, SOCKET_PATH, sizeof(SOCKET_PATH));

Modified

snprintf(server_addr.sun_path, UNIX_PATH_MAX, SOCKET_PATH);

Hope it helps.

Steven Im
0

Not sure how SOCKET_PATH is defined, but if it's a string literal as I suspect, then sizeof(SOCKET_PATH) will be the size of a char*, typically either 4 or 8 bytes.

WallStProg
  • Incorrect. sizeof("ABCDEF") is 7 (1 more than the number of bytes). The type of a string literal is an array, not a pointer. In most contexts an array is converted to a pointer, but they are not the same. – Per Bothner Feb 14 '20 at 01:59
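The difference described in the comment above can be checked with a few lines of C. This is only a sketch; the macro name NAME and the three function names are made up for the illustration, and "ABCDEF" is the literal from the comment.

```c
#include <stddef.h>
#include <string.h>

#define NAME "ABCDEF"   /* hypothetical macro holding a string literal */

/* sizeof applied to the literal itself: the array size, including the
 * terminating '\0'. */
size_t literal_size(void)
{
    return sizeof(NAME);
}

/* sizeof applied to a pointer to the same characters: the pointer size. */
size_t pointer_size(void)
{
    const char *p = NAME;
    return sizeof(p);
}

/* strlen counts the characters before the terminating '\0'. */
size_t string_length(void)
{
    return strlen(NAME);
}
```

So sizeof only degrades to the pointer size when the name is held in a char* variable, as pcSocketName is in the server example earlier on this page; applied to a macro that expands to a literal it gives the array size.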