
For debugging purposes, I use pthread_setname_np(3) and pthread_getname_np. The name passed to them is limited by TASK_COMM_LEN (see this), which is #define-d to be 16 bytes in include/linux/sched.h of the Linux kernel. I would like to slightly increase that limit for pthread_setname_np, e.g. to 24, mostly for convenience (so it is not a big deal for me).
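To make the limit concrete from user space, here is a minimal sketch, assuming Linux with glibc. The helper names `name_current_thread` and `current_thread_name` and the macro `THREAD_NAME_MAX` are made up for illustration and are not part of any library; only `pthread_setname_np`, `pthread_getname_np` and `pthread_self` are real calls.

```c
#define _GNU_SOURCE            /* needed for the *_np (non-portable) functions */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Assumption: mirrors the kernel's TASK_COMM_LEN, i.e. 16 bytes
 * including the terminating NUL, so 15 usable characters. */
#define THREAD_NAME_MAX 16

/* Hypothetical helper: name the calling thread.
 * Returns 0 on success or an errno value; glibc reports ERANGE
 * when the name does not fit in the 16-byte limit. */
int name_current_thread(const char *name)
{
    assert(name != NULL);
    return pthread_setname_np(pthread_self(), name);
}

/* Hypothetical helper: read the calling thread's name back.
 * The buffer must be at least THREAD_NAME_MAX bytes long. */
int current_thread_name(char *buf, size_t len)
{
    assert(buf != NULL);
    return pthread_getname_np(pthread_self(), buf, len);
}
```

With these helpers, a short name such as "worker-0" is accepted and can be read back, while a name of 16 or more characters is rejected with ERANGE instead of being silently truncated.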

(This is only for debugging purposes and convenience, and I am not willing to spend more than an hour of work on it. I can manage without increasing TASK_COMM_LEN, and I have enough RAM, 32 Gbytes, so wasting an additional 4K page per process is not an issue for me.)

I'm compiling the latest stable Linux kernel, 4.15.2 today (February 10, 2018) on Debian/Sid/x86-64. FWIW, my gcc is version 7.3.0

I just changed line 167 of include/linux/sched.h to

 #define TASK_COMM_LEN          24 /*was 16*/

When compiling that kernel with make deb-pkg I am getting an error:

  CC      drivers/connector/cn_proc.o
In file included from ./include/linux/kernel.h:10:0,
                 from drivers/connector/cn_proc.c:25:
drivers/connector/cn_proc.c: In function ‘proc_comm_connector’:
./include/linux/compiler.h:324:38: error: call to ‘__compiletime_assert_240’ declared with attribute error: BUILD_BUG_ON failed: sizeof(ev->event_data.comm.comm) != TASK_COMM_LEN
  _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
                                      ^
./include/linux/compiler.h:304:4: note: in definition of macro ‘__compiletime_assert’
    prefix ## suffix();    \
    ^~~~~~
./include/linux/compiler.h:324:2: note: in expansion of macro ‘_compiletime_assert’
  _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
  ^~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:47:37: note: in expansion of macro ‘compiletime_assert’
 #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                     ^~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:71:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
  BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
  ^~~~~~~~~~~~~~~~
./include/linux/sched.h:1497:2: note: in expansion of macro ‘BUILD_BUG_ON’
  BUILD_BUG_ON(sizeof(buf) != TASK_COMM_LEN); \
  ^~~~~~~~~~~~
drivers/connector/cn_proc.c:240:2: note: in expansion of macro ‘get_task_comm’
  get_task_comm(ev->event_data.comm.comm, task);
  ^~~~~~~~~~~~~
scripts/Makefile.build:316: recipe for target 'drivers/connector/cn_proc.o' failed
make[4]: *** [drivers/connector/cn_proc.o] Error 1
scripts/Makefile.build:575: recipe for target 'drivers/connector' failed
make[3]: *** [drivers/connector] Error 2
Makefile:1018: recipe for target 'drivers' failed
make[2]: *** [drivers] Error 2
scripts/package/Makefile:86: recipe for target 'deb-pkg' failed
make[1]: *** [deb-pkg] Error 2
Makefile:1345: recipe for target 'deb-pkg' failed
make: *** [deb-pkg] Error 2

Any quick and dirty solution to fix that?

(I can manage using the older 16 byte limit for TASK_COMM_LEN; it is just for convenience that I want to raise it.)

In other words, why do I have to change the size of thread names in more than one place? I am wary of breaking the kernel scheduler in some obscure way.

alk
Basile Starynkevitch
  • Increase size of `buf`. Please read and understand what the compile-time assert is supposed to achieve. In this case it makes sure that two pieces fit. You changed one, now you also have to change the other. – Ulrich Eckhardt Feb 10 '18 at 08:05
  • But isn't that somehow a bug ? I understand the compile time assert, but I don't understand why I have to change a size in more than one place! I don't have time to patch the kernel in many places... How can I be sure (not knowing much of the kernel internals) that patching at these two places is enough? – Basile Starynkevitch Feb 10 '18 at 08:05
  • I wouldn't call it a bug. It isn't the most beautiful solution to keeping two pieces in sync, though. Whether there are more places and whether some of them are even hidden, I don't know. I'd assume there are few and hope that they are guarded by those compile-time assertions, but there's no guarantee. – Ulrich Eckhardt Feb 10 '18 at 08:09
  • It is at least a defect, IMHO. – Basile Starynkevitch Feb 10 '18 at 08:10

1 Answer


The buffer in question (event_data.comm.comm in struct proc_event) is part of a userspace API and ABI: it is carried in messages sent over netlink sockets with the netlink family NETLINK_CONNECTOR. That's why its size is written explicitly as 16. TASK_COMM_LEN isn't available in a userspace API header file, and the size of the buffer in event_data can't be changed anyway because it's part of the ABI.
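The interaction between the fixed-size ABI buffer and the compile-time check can be sketched in plain C11. This is only an illustration, not the kernel's code: `struct comm_event_sketch`, `COPY_TASK_COMM` and `fill_comm_event` are hypothetical stand-ins, the struct is a simplified version of the real event layout, and `_Static_assert` is used in place of the kernel's BUILD_BUG_ON.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value from the unmodified kernel; changing it to 24 makes the
 * _Static_assert below fail, just like the build error in the question. */
#define TASK_COMM_LEN 16

/* Hypothetical, simplified stand-in for the comm event in struct proc_event.
 * The size is a literal 16 because a userspace header cannot see
 * TASK_COMM_LEN, and the layout is fixed by the ABI. */
struct comm_event_sketch {
    int  process_pid;
    int  process_tgid;
    char comm[16];
};

/* Same idea as the check inside get_task_comm(): refuse to compile
 * unless the destination is an array of exactly TASK_COMM_LEN bytes. */
#define COPY_TASK_COMM(buf, src)                                   \
    do {                                                           \
        _Static_assert(sizeof(buf) == TASK_COMM_LEN,               \
                       "buffer size must equal TASK_COMM_LEN");    \
        strncpy((buf), (src), sizeof(buf) - 1);                    \
        (buf)[sizeof(buf) - 1] = '\0';                             \
    } while (0)

/* Hypothetical helper: copy a task name into the event and return
 * the length actually stored (at most TASK_COMM_LEN - 1). */
size_t fill_comm_event(struct comm_event_sketch *ev, const char *task_name)
{
    assert(ev != NULL && task_name != NULL);
    COPY_TASK_COMM(ev->comm, task_name);
    return strlen(ev->comm);
}
```

The point of the sketch is that the assertion ties two independent definitions together: the kernel-internal constant can be edited freely, but the struct layout is seen by userspace and cannot follow it, so the mismatch is caught at compile time rather than overflowing the buffer at run time.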

You could try disabling the "connector" driver in the kernel config - it's in the "Device Drivers" menu, with the name "Connector - unified userspace <-> kernelspace linker". This is the driver that provides the NETLINK_CONNECTOR netlink family, and isn't needed if you don't have any userspace code relying on that feature. That will prevent this particular code from causing you issues.

You might, however, also find similar ABI issues elsewhere.

caf
  • What system call (from [syscalls(2)](http://man7.org/linux/man-pages/man2/syscalls.2.html)...) is using that buffer? – Basile Starynkevitch Feb 10 '18 at 13:32
  • @BasileStarynkevitch: It's sent over a netlink socket (see the netlink(7) man page) with netlink family `NETLINK_CONNECTOR`. If you don't have anything on your system using the kernel connector driver, you can compile it out. – caf Feb 11 '18 at 00:16
  • @caf: that should go into your answer. Do you have an idea of what practical software on Debian would use that netlink socket? – Basile Starynkevitch Feb 11 '18 at 11:51
  • @BasileStarynkevitch: I've revised the answer. It looks like the most common packages using `NETLINK_CONNECTOR` are `drbd-utils`, `lvm2` and `ulatencyd`. – caf Feb 12 '18 at 03:10