
A couple of days ago, I ran into a strange error involving struct sockaddr_un. More about that can be read here.

After some time, I found the solution to that error, which was simply to memset the entire variable. Now my question, as already posed in one post on that site, is: why?

Why do I need to zero the whole variable, which is local and not dynamically allocated, even though I had set each member to a legal and correct value?

Other structs (for example struct sockaddr_in) contain a member that serves as padding so that the struct has the same size as struct sockaddr. You have to fill that member with 0 (or maybe you don't have to? Please correct me) for the program to run correctly every time.
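As a minimal sketch of the padding point, assuming a Linux or BSD style struct sockaddr_in that has a sin_zero member: clearing the whole struct first leaves the padding zeroed without ever naming it. The helper name init_loopback_addr is made up for illustration, and the sketch does not settle whether a given kernel actually inspects sin_zero.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the original post): fill *sin with the
 * IPv4 loopback address and the given port, clearing everything else. */
static void init_loopback_addr(struct sockaddr_in *sin, uint16_t port)
{
    assert(sin != NULL);

    /* Zero the whole struct first, so sin_zero and any other padding
     * bytes hold 0 no matter what the stack contained before. */
    memset(sin, 0, sizeof(*sin));

    sin->sin_family = AF_INET;
    sin->sin_port = htons(port);                  /* network byte order */
    sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
}
```

Zeroing first and then assigning the named members means the result does not depend on leftover stack contents, which is the usual reason for the memset habit.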

Also, where can I view source code such as that of connect()? Could it be that connect() isn't implemented neatly? Or am I missing some basics?

Thank you in advance.

xQuare
  • connect() is a system call. It's implemented in kernel. – Zaffy Aug 04 '12 at 15:40
  • Can anyone suggest some online frontends for browsing the kernel source code? For example, one where I enter the function name and it shows me the source code? Or something like that... – xQuare Aug 04 '12 at 15:46
  • What system are you running on? – jxh Aug 04 '12 at 16:32
  • The non-custom debian squeeze. – xQuare Aug 04 '12 at 16:37
  • Are you using the abstract namespace? – jxh Aug 04 '12 at 16:38
  • I am using local file paths. I'm pretty sure that I initialized sun_family and sun_path properly. I used the macro constant and terminated the string with '\0'. I also used the correct length for the size of the address when calling connect(). – xQuare Aug 04 '12 at 16:45
  • @Papergay: Please try the code in [this pastebin link](http://pastebin.com/wTz6e2MW) and let me know if it works for you, because it works for me. – jxh Aug 04 '12 at 17:22
  • @Papergay: In your code, try adding a while loop for connect like I have. – jxh Aug 04 '12 at 17:32
  • `struct sockaddr_un sa; sa.sun_family = AF_UNIX; strncpy(sa.sun_path, mypath, sizeof(sa.sun_path)-1); connect(fd, sa, sizeof(sa));` should do the initialisation and connection setup. If you get `EINVAL`, something went wrong prior to running the code snippet shown above. – alk Aug 04 '12 at 17:37
  • In my above comment it should read: `... connect(fd, &sa, sizeof(sa));` – alk Aug 04 '12 at 18:08

2 Answers


From reading the unix_stream_connect source code, there doesn't seem to be any requirement to zero out the structure. I wrote a test program that intentionally fills the structure with a value, with the client and server passing in different fill values, and they are able to connect to each other fine:

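#include <stdio.h>      // snprintf
#include <string.h>     // memset
#include <sys/socket.h> // AF_UNIX
#include <sys/un.h>     // struct sockaddr_un

// Build a Unix-domain address whose unused sun_path bytes hold the fill value I.
void init_addr (struct sockaddr_un *addr, const char *path, int I) {
    struct sockaddr_un tmp = { .sun_family = AF_UNIX };
    // fill sun_path (sizeof avoids relying on UNIX_PATH_MAX, which the
    // userland headers may not define)
    memset(tmp.sun_path, I, sizeof(tmp.sun_path));
    // copy path; snprintf truncates if needed and always NUL-terminates
    snprintf(tmp.sun_path, sizeof(tmp.sun_path), "%s", path);
    *addr = tmp;
}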

What might be causing the connection failure is a race condition, where the server has not yet finished its setup when the client attempts to connect.
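A minimal sketch of a retry loop around connect for that situation, assuming a Linux-style Unix-domain stream socket. The helper name connect_with_retry, the 100 ms delay, and the choice of ENOENT and ECONNREFUSED as the "server not ready yet" errors are assumptions made for illustration; they are not taken from the pastebin code.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: try connect() up to `attempts` times, sleeping
 * briefly between tries. Returns 0 on success, -1 on failure with errno
 * set to the error of the last connect() call. */
static int connect_with_retry(int fd, const struct sockaddr_un *addr,
                              int attempts)
{
    assert(addr != NULL && attempts > 0);

    const struct timespec delay = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };
    int saved_errno = 0;

    for (int i = 0; i < attempts; i++) {
        if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) == 0)
            return 0;

        saved_errno = errno;
        /* Assumption: only these two errors mean "server not set up yet".
         * Anything else (EINVAL, for instance) is a real bug, so give up. */
        if (saved_errno != ENOENT && saved_errno != ECONNREFUSED)
            break;

        nanosleep(&delay, NULL);
    }

    errno = saved_errno;
    return -1;
}
```

If a loop like this makes the failure disappear, that points to a startup race rather than to the contents of the address structure. An error such as EINVAL is deliberately not retried, since waiting cannot fix a wrong length or family.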

jxh
  • After successfully testing the very same code that did not work for me before, I think you must be right. Can you elaborate a bit further on that, though? How long does the setup of the server take? If I remember correctly, I created the client even after invoking the listen() call. – xQuare Aug 04 '12 at 17:59
  • One should not get `EINVAL` when the server being connected to is not ready yet. – alk Aug 04 '12 at 18:12
  • What's the point of filling `sun_path` with `I`s? – alk Aug 04 '12 at 18:15
  • @Papergay: Not seeing your code, it is hard for me to give a meaningful answer. If adding a while loop around your `connect` call fixes it, I am confident it is a race. You can ask another question where you provide some more details about your client and server if you would like help finding the race. – jxh Aug 04 '12 at 18:29
  • @alk: The `I` is a fill value. In my test code, server filled structure with `'A'`, and client filled structure with `'B'`. But, both set the path to the same value for `accept`/`connect` respectively. – jxh Aug 04 '12 at 18:31
  • Okay, but I will need some time to alter my code, since I have been working on it over the past days. I will update it soon, though. Thank you for your patience. – xQuare Aug 04 '12 at 18:31
  • Uhh, I can't seem to recreate the error anymore. I did not make a copy of it either. But is what alk said about the EINVAL error correct or not? If you are confident that a race might cause such an error, then my question is answered, though. – xQuare Aug 06 '12 at 11:22
  • @Papergay: I am confident, but the first sentence of your original question was: "as stated in the title, my connect() call to an unix domain type socket with an according address results in the error ENOENT: no such file or directory." – jxh Aug 06 '12 at 14:51
  • Yes it was. And I fixed that error at that time and asked another question in my answer, but I did not get any response (obviously). So I decided to ask this in a different topic, because it seemed very strange to me and the solution that I found should actually have made no difference. But it still solved the error I had. [This](http://stackoverflow.com/a/11648160/1548637) is what I am talking about. – xQuare Aug 06 '12 at 15:00
  • @Papergay: I think your original issue was a race, and you introduced some other issue when you were experimenting with the size of your structure. From the source code, there are only three reasons for `connect` to return `EINVAL`. (1) The len parameter is wrong, (2) the family field is wrong, and (3) the socket is not in the `CLOSED` state. – jxh Aug 06 '12 at 15:12
  • Okay, I think I understand what you are trying to say. So adding the memset() and using the initial code just caused my process/thread to take longer to reach the `connect()` call? Which is why I did not get the `ENOENT` anymore: the server was set up by then. The `EINVAL` was not avoidable, since I passed the wrong size to the `connect()` call anyway. So the race condition caused only the `ENOENT` error, not the `EINVAL` error. – xQuare Aug 06 '12 at 15:27
  • @Papergay: Yes, that's what I conclude from the available evidence. – jxh Aug 06 '12 at 15:35

The answer depends on a couple of things.

First, if you are not using anonymous (abstract) names, that is, if sun_path does not start with a 0 byte, then you do not need to clear out the structure.

If you are using anonymous names, it depends on whether you pass the "len" parameter to the various calls as "sizeof(struct sockaddr_un)" or as something like "len = strlen(local.sun_path + 1) + 1 + sizeof(local.sun_family);" (the latter is ugly because it ignores possible structure-packing issues).

For anonymous names, the socket name includes all characters after the first 0, up to the end of the structure (as defined by len). So if you pass len as sizeof(struct sockaddr_un) (which is recommended), the entire sun_path must match between the two calls, and therefore zeroing out the structure (or setting the structures identically) is required.

For best results, readability and portability, you should use len = sizeof(struct sockaddr_un), and memset the sockaddr_un to 0 before initializing it.
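A minimal sketch of that advice for a Linux abstract name. The helper name init_abstract_addr is hypothetical, and offsetof is used here as one way to sidestep the structure-packing concern mentioned above; it is not the only way to compute the length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill *addr with an abstract-namespace name.
 * Returns the exact address length (family + leading 0 byte + name),
 * or 0 if the name does not fit into sun_path. */
static socklen_t init_abstract_addr(struct sockaddr_un *addr, const char *name)
{
    assert(addr != NULL && name != NULL);

    size_t n = strlen(name);
    if (n + 1 > sizeof(addr->sun_path))
        return 0;

    /* Zero everything so that two addresses built from the same name are
     * byte-for-byte identical, whatever the memory held before. */
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;

    /* sun_path[0] stays 0, which marks the name as abstract. */
    memcpy(addr->sun_path + 1, name, n);

    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}
```

Both sides then have to agree on the length: either each passes sizeof(struct sockaddr_un) together with the zeroed structure, or each passes the exact length returned here. Mixing the two conventions yields two different abstract names, so the connect will not find the listener.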

This post should be helpful: Can not connect to Linux "abstract" unix socket
