struct epoll_event memset or no memset?

Question

When browsing through code on the Internet, I often see snippets like these:

struct epoll_event event;
memset(&event, 0, sizeof(event));

This pattern seems needless to me if event is then filled out in full, but it is widespread. Is it perhaps meant to guard against possible future changes to the struct?

score 2 · Accepted Answer · answered Nov 08 '19 at 15:26

This is surely just bad copy-and-paste coding. The man page for epoll does not document any need to zero-initialize the epoll_event structure, and does not do so in the examples. Future changes to the struct do not even seem to be possible (ABI), but if they were, the contract would clearly be that any parts of the structure not related to the events you requested would be ignored (and not even read, since the caller may be passing a pointer to storage that does not extend past the original definition).
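As a minimal sketch of the no-memset style described above (Linux only): both members of the struct are assigned before the call, so nothing is zeroed first. The helper names `watch_for_input` and `roundtrip_ok` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register fd for read-readiness on epfd.
   Both members (events and data) are assigned explicitly, so the
   struct is not zeroed first. Returns 0 on success, -1 on error. */
int watch_for_input(int epfd, int fd)
{
    struct epoll_event ev;

    assert(fd >= 0);
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical round-trip check: make a pipe readable, wait on it, and
   return 1 if epoll_wait hands back the same fd that was stored in
   ev.data.fd, 0 if it differs, -1 on any failure. */
int roundtrip_ok(void)
{
    int p[2];
    int result = -1;
    struct epoll_event out;

    if (pipe(p) != 0)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd >= 0) {
        if (watch_for_input(epfd, p[0]) == 0 &&
            write(p[1], "x", 1) == 1 &&
            epoll_wait(epfd, &out, 1, 1000) == 1 &&
            (out.events & EPOLLIN))
            result = (out.data.fd == p[0]);
        close(epfd);
    }
    close(p[0]);
    close(p[1]);
    return result;
}
```

The kernel hands the `data` member back unchanged from `epoll_wait`, which is why only the union member that was stored (`data.fd` here) should be read from the result.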

Also, in general it's at best pointless and at worst incorrect/nonportable to use memset when a structure is supposed to be zero-initialized, since the zero representation need not be the zero value (for pointer and floating point types). Nowadays this generality is mostly a historical curiosity, and not relevant to a Linux-specific interface like epoll anyway, but it comes up as well with mbstate_t which exists in fully general C, and where zero initialization is required to correctly use the associated interfaces. The correct way to zero-initialize things that need zero values, rather than all-zero-bytes representations, is with the universal zero initializer, { 0 }.
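To illustrate the `{ 0 }` initializer with mbstate_t, here is a small sketch that counts multibyte characters with `mbrtowc`. The function name `count_chars` is hypothetical, and the result depends on the current locale; the sketch assumes the program's default "C" locale and plain ASCII input.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the multibyte characters in s.
   Returns (size_t)-1 on an invalid or incomplete sequence. */
size_t count_chars(const char *s)
{
    assert(s != NULL);

    /* { 0 } gives every member its zero *value*, which for mbstate_t
       describes the initial conversion state. */
    mbstate_t st = { 0 };
    size_t remaining = strlen(s);
    size_t count = 0;

    while (remaining > 0) {
        /* A null first argument asks mbrtowc to decode without storing. */
        size_t n = mbrtowc(NULL, s, remaining, &st);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;
        if (n == 0)
            break;            /* null character reached */
        s += n;
        remaining -= n;
        count++;
    }
    return count;
}
```

The same `= { 0 }` form works for any object type, including scalars, pointers, and nested structs, which is why it is usable without knowing the layout of the type.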

why is it impossible that future changes to epoll might see addition of fields to the `struct epoll_event`, e.g. new fields after the `data` member? — Darren Smith, Nov 08 '19 at 15:36
@user1095108: Nope, that's not even C (it's a constraint violation to write that). It is in C++. — R.. GitHub STOP HELPING ICE, Nov 08 '19 at 18:52
@DarrenSmith: Without some way for the caller to indicate presence of further fields, it's impossible because it would be a breaking change in syscall API/ABI which is against kernel policy. I'm not sure if it can be done with such an indication; it depends on the exact way the structure is used in various places that I'm not sufficiently familiar with for epoll. — R.. GitHub STOP HELPING ICE, Nov 08 '19 at 18:54
@R Oh yeah, GNU extension made it compile and -pedantic revealed the issue. — user1095108, Nov 08 '19 at 19:35

Darren Smith · Answer 2 · 2019-11-08T16:08:20.040

Using memset like this can help you locate bugs faster. Consider it a defensive (even secure) style of programming.

Let's say you didn't use memset, and instead attempted to diligently fill in each member as documented by the API. If you ever forget to fill in a field (or a later API change adds a new field), the value of that field at run time is indeterminate; in practice it will be whatever the memory previously held.

What are the consequences?

If you are lucky, your code will immediately fail in a nice way that can be debugged, for example, if the unset field needs a highly specific value.

If you are unlucky, your code may still work, and it may work for years. Perhaps on your current operating system the program memory somehow already held the correct value expected by the API. But as you move your code across systems and compilers, expect confusing behavior: "it works on my machine, but I don't understand why it doesn't work on yours".

So in this case, memset is helping you avoid this nondeterministic behavior.
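A small sketch of the effect, assuming Linux and a deliberately buggy, hypothetical helper named `forgetful_event` that never sets the `events` member. Because the struct is zeroed first, the forgotten member is 0 on every run rather than leftover stack contents.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>

/* Hypothetical, deliberately incomplete helper: it "forgets" to set
   ev.events. Zeroing first makes the forgotten member read as 0 on
   every run instead of whatever the stack previously held. */
struct epoll_event forgetful_event(int fd)
{
    struct epoll_event ev;

    assert(fd >= 0);
    memset(&ev, 0, sizeof ev);
    ev.data.fd = fd;          /* ev.events intentionally left unset */
    return ev;
}
```

An event mask of 0 is still a mistake, but it is the same mistake on every machine, which makes it easier to reproduce and track down.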

Of course, you can still profile your code, check for uninitialized memory, write unit tests, and so on. Using memset is not a replacement for those; it's just another technique for getting to safe software.
