As stated in the title, my connect() call on a Unix domain socket with a matching address fails with the error ENOENT: no such file or directory.

The two sockets are properly initialized, and the socket files are created and bound accordingly. The server and client sockets run in different processes; the client process is fork()-ed and execl()-ed. The addresses of the client and server sockets are passed along this way as well, and the client parses them when setting up its socket. The server process uses pthreads.

Here is my connect() attempt:

struct sockaddr_un address;
address.sun_family = AF_UNIX;
memcpy(address.sun_path, filepath.c_str(), filepath.length());
address.sun_path[filepath.length()] = '\0';

if(-1 == connect(this->unix_domain_descriptor_.descriptor(),       \
                (struct sockaddr*)&address,                       \
                size))
{
    global::ExitDebug(-1, "connect() failed", __FILE__, __LINE__);
    return -1;
}

I tried out different values for size, such as:

//  this is from the unix(7) man page. It works neither with nor without "+1"
socklen_t size =  offsetof(struct sockaddr_un, sun_path);
          size += strlen(address.sun_path) + 1;

//  this is from one of my books about linux programming
socklen_t size = sizeof(address);

//  this is from sample code which I found on the internet
socklen_t size = sizeof(address.sun_family) + strlen(address.sun_path);

//  Update 1: 
socklen_t size = SUN_LEN(&address);

//  this is what I tried out after looking into the declaration
//  of struct sockaddr_un
socklen_t size = strlen(address.sun_path);

Surprisingly, all initializations except the last one result in an EINVAL: invalid argument error for connect() and I get the ENOENT: no such file or directory only with the last one. I even tried out entire examples from the internet, but without success. And obviously, swapping socklen_t with size_t or int doesn't change anything.
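For comparison, here is a small sketch of what each of the expressions above evaluates to for a given path. The struct `CandidateSizes` and the function `candidate_sizes` are hypothetical names introduced only for this illustration, and the address is zeroed before use so that strlen() sees a terminated string.

```cpp
#include <cassert>
#include <cstddef>       // offsetof
#include <cstring>       // std::memset, std::memcpy, std::strlen
#include <string>
#include <sys/socket.h>  // socklen_t, AF_UNIX
#include <sys/un.h>      // struct sockaddr_un, SUN_LEN

// Hypothetical container for the five candidate values from the question.
struct CandidateSizes {
    socklen_t man_page;          // offsetof(sun_path) + strlen + 1
    socklen_t whole_struct;      // sizeof(address)
    socklen_t family_plus_path;  // sizeof(sun_family) + strlen
    socklen_t sun_len;           // SUN_LEN(&address)
    socklen_t path_only;         // strlen alone, which leaves out sun_family
};

CandidateSizes candidate_sizes(const std::string& filepath)
{
    struct sockaddr_un address;
    // Precondition: the path plus its terminator must fit into sun_path.
    assert(filepath.length() < sizeof(address.sun_path));

    std::memset(&address, 0, sizeof(address));
    address.sun_family = AF_UNIX;
    std::memcpy(address.sun_path, filepath.c_str(), filepath.length());

    const std::size_t path_len = std::strlen(address.sun_path);

    CandidateSizes s;
    s.man_page = static_cast<socklen_t>(
        offsetof(struct sockaddr_un, sun_path) + path_len + 1);
    s.whole_struct = static_cast<socklen_t>(sizeof(address));
    s.family_plus_path = static_cast<socklen_t>(
        sizeof(address.sun_family) + path_len);
    s.sun_len = static_cast<socklen_t>(SUN_LEN(&address));
    s.path_only = static_cast<socklen_t>(path_len);
    return s;
}
```

With the 61-character path and the sizes reported in this question (a 2-byte sun_family and sizeof(address) of 110), these would come out as 64, 110, 63, 63 and 61, assuming sun_path starts directly after sun_family. Only the last value does not account for the sun_family field at all.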

I already checked for this:

  • address.sun_path contains the correct socket file path starting from the root directory
  • address.sun_path has the length of 61 characters
  • address.sun_family is set to AF_UNIX/AF_LOCAL
  • address.sun_family has the size of 2 bytes
  • no errors at creating and binding both sockets
  • server socket is in listening state
  • sizeof(address) returns 110 as it is supposed to be

Now I was wondering why the man page example was not working and whether there had been changes that were not reflected at linux.die.net or www.kernel.org. My OS is Debian Squeeze, if that's relevant.

Any ideas what I am doing wrong? And how to solve it? If you need more code or have questions, then don't hesitate to ask me (though I don't need to state this probably, but this is my first post here >.<).

btw, sorry for my bad English

Update 2

Solved. I will post it as a separate answer below for clarity.

Noam M
xQuare
  • How about [`strcpy`](http://linux.die.net/man/3/strcpy) instead of the `memcpy` and manual placement of the null-terminator? – Jonathon Reinhart Jul 24 '12 at 13:11
  • Thanks for your reply. I tried it just now and it didn't change. The null-terminator is already set manually btw. – xQuare Jul 24 '12 at 13:14
  • die.net is a terrible site with outdated information and a lot of ads. It survives due to a heavy SEO — that's all. I'd recommend you never use it. –  Jul 24 '12 at 13:16
  • Thanks. I prefer www.kernel.org anyway. Would you recommend any other pages? Maybe I can find something there... – xQuare Jul 24 '12 at 13:19
  • Your phrase “except the last one” no longer holds after the update, does it? – MvG Jul 24 '12 at 13:40
  • You could try to `strace` the client, see whether that prints anything unexpected during the `connect`. It's a shot in the dark, though. – MvG Jul 24 '12 at 13:48
  • No, it doesn't anymore. I'm trying to keep it as up to date as possible, though. Thanks for your patience. I'm going to try strace now, though I need to read a bit about it first, so it will take some time. – xQuare Jul 24 '12 at 13:54
  • I either don't know how to use strace, or it does not track the child processes properly. Instead of creating a socket, it suddenly starts listening/selecting/accepting, which is never called in the child process. Am I missing something when I use it? strace -o "outputfile" -ttT -f -ff "executionfile" – xQuare Jul 24 '12 at 16:39

2 Answers

After figuring out that I was handling the sockets properly, I altered my code for connect() a little and now it works. I just added this line after the declaration of my variable:

memset(&address, 0, sizeof(struct sockaddr_un));

Does anyone know why I need to set the whole variable to 0 to make it work? Should I ask this in a new topic or can I ask this here?
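A minimal sketch of the pattern described in this answer, for illustration only: the whole structure is zeroed first, and only then are the family and path filled in. The helper name `make_unix_address` is hypothetical and not part of any library, and the length uses the offsetof-based formula from the unix(7) man page mentioned in the question.

```cpp
#include <cassert>
#include <cstddef>       // offsetof
#include <cstring>       // std::memset, std::memcpy
#include <string>
#include <sys/socket.h>  // socklen_t, AF_UNIX
#include <sys/un.h>      // struct sockaddr_un

// Hypothetical helper: fills `address` for `filepath` and stores the length
// to pass to connect() in `size`. Returns false if the path is empty or does
// not fit into sun_path together with its terminating '\0'.
bool make_unix_address(const std::string& filepath,
                       struct sockaddr_un& address,
                       socklen_t& size)
{
    if (filepath.empty() || filepath.length() >= sizeof(address.sun_path))
        return false;

    // Zero everything first, so no byte of the structure is left
    // uninitialized and the path is terminated by construction.
    std::memset(&address, 0, sizeof(struct sockaddr_un));
    address.sun_family = AF_UNIX;
    std::memcpy(address.sun_path, filepath.c_str(), filepath.length());

    size = static_cast<socklen_t>(
        offsetof(struct sockaddr_un, sun_path) + filepath.length() + 1);
    assert(size <= sizeof(address));
    return true;
}
```

The resulting address and size would then be passed to connect() as in the question, in place of the hand-filled structure. Rejecting over-long paths up front is a design choice of this sketch; it avoids writing past the end of sun_path.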

xQuare
  • 1
    There are fields in the `struct sockaddr_un` which might be read by the functions where it's passed. If these values are not initialized then undefined behavior will occur. Furthermore, many of these values default to `0`, and may have invalid values if not initialized properly. Or if they are pointers then the function would dereference a `NULL` pointer. – Iharob Al Asimi Apr 09 '16 at 19:00
  • Yes, that is why I was setting all variables that I needed, even the default values. But good that you point that out explicitly :D – xQuare Apr 10 '16 at 08:47
Quoting from the glibc manual:

You should compute the LENGTH parameter for a socket address in the local namespace as the sum of the size of the sun_family component and the string length (not the allocation size!) of the file name string. This can be done using the macro SUN_LEN:

  • Macro: int SUN_LEN (struct sockaddr_un * PTR)
    The macro computes the length of socket address in the local namespace.

The example that follows in the manual uses a computation which you say fails for you:

size = (offsetof (struct sockaddr_un, sun_path)
       + strlen (name.sun_path) + 1);

But you should try that macro. If something changed, or the example is wrong, there is still a good chance that the macro works as intended. If it does, you can look at its innards. At first glance, it seems to me that the macro lacks the + 1 part used in all the examples, which matches the warning from the manual to use “not the allocation size!” Since your post says this didn't work without + 1 either, chances are slim, though.

Out of curiosity, what's the length of the path? Have you checked whether the field provided in the structure is large enough to hold it? What is sizeof(address.sun_path) in your implementation? I wonder whether you might be copying to unreserved memory, and part of the path gets overwritten at the next function call.
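To test the length computation in isolation, a small round trip along these lines might help. It is only a sketch: `listen_on` and `try_connect` are hypothetical helper names, the socket path is assumed to fit into sun_path, and the length is the one from the manual's example above.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>       // offsetof
#include <cstring>       // std::memset, std::memcpy
#include <string>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>      // close, unlink

// Hypothetical helper: builds a zeroed address for `path` and its length.
static socklen_t fill_address(const std::string& path, struct sockaddr_un& name)
{
    assert(path.length() < sizeof(name.sun_path));  // path must fit
    std::memset(&name, 0, sizeof(name));
    name.sun_family = AF_UNIX;
    std::memcpy(name.sun_path, path.c_str(), path.length());
    return static_cast<socklen_t>(
        offsetof(struct sockaddr_un, sun_path) + path.length() + 1);
}

// Hypothetical helper: binds a stream socket to `path` and starts listening.
// Returns the descriptor, or -1 on failure.
int listen_on(const std::string& path)
{
    struct sockaddr_un name;
    const socklen_t size = fill_address(path, name);
    const int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;
    unlink(path.c_str());  // remove a stale socket file, if there is one
    if (bind(fd, reinterpret_cast<struct sockaddr*>(&name), size) == -1 ||
        listen(fd, 5) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

// Hypothetical helper: connects a fresh client socket to `path`.
// Returns 0 on success, otherwise the errno value set by the failing call.
int try_connect(const std::string& path)
{
    struct sockaddr_un name;
    const socklen_t size = fill_address(path, name);
    const int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return errno;
    int result = 0;
    if (connect(fd, reinterpret_cast<struct sockaddr*>(&name), size) == -1)
        result = errno;  // saved before close() can overwrite errno
    close(fd);
    return result;
}
```

Calling try_connect() on a path where no socket file exists should report ENOENT, while calling it after listen_on() has succeeded on the same path should return 0. Comparing the two cases helps tell a wrong path apart from a wrong length.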

MvG
  • Using my first initialization without "+1" returns the same value as the one your manual asks for (which I had already tried before). SUN_LEN returns the very same value. Thank you for your reply, though; knowing about SUN_LEN will make things easier in the future. – xQuare Jul 24 '12 at 13:37
  • The path length is 61 characters. sizeof(address.sun_path) returns 108. I'm not reserving the memory dynamically so I doubt that I am copying anything to unreserved memory. – xQuare Jul 24 '12 at 13:44