I'm writing a program that uses the Netlink protocol to gather task statistics. I'm not getting very far, because the kernel responds with an error to what I believe is a valid packet. I've used strace to compare the behaviour of my program with that of iotop, which works correctly.
The relevant bit of the strace from iotop:
socket(PF_NETLINK, SOCK_RAW, 16) = 3
setsockopt(3, SOL_SOCKET, SO_SNDBUF, [65536], 4) = 0
setsockopt(3, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0
bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(3, {sa_family=AF_NETLINK, pid=-4286, groups=00000000}, [12]) = 0
sendto(3, "\x24\x00\x00\x00\x10\x00\x01\x00\x01\x00\x00\x00\x42\xef\xff\xff\x03\x00\x00\x00\x0e\x00\x02\x00\x54\x41\x53\x4b\x53\x54\x41\x54\x53\x00\x00\x00", 36, 0, NULL, 0) = 36
recvfrom(3, "\x70\x00\x00\x00\x10\x00\x00\x00\x01\x00\x00\x00\x42\xef\xff\xff\x01\x02\x00\x00\x0e\x00\x02\x00\x54\x41\x53\x4b\x53\x54\x41\x54\x53\x00\x00\x00\x06\x00\x01\x00\x17\x00\x00\x00\x08\x00\x03\x00\x01\x00\x00\x00\x08\x00\x04\x00\x00\x00\x00\x00\x08\x00\x05\x00\x04\x00\x00\x00\x2c\x00\x06\x00\x14\x00\x01\x00\x08\x00\x01\x00\x01\x00\x00\x00\x08\x00\x02\x00\x0b\x00\x00\x00\x14\x00\x02\x00\x08\x00\x01\x00\x04\x00\x00\x00\x08\x00\x02\x00\x0a\x00\x00\x00", 16384, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 112
The corresponding part of the strace output from my program:
bind(8, {sa_family=AF_NETLINK, pid=19156, groups=00000000}, 12) = 0
setsockopt(8, SOL_SOCKET, SO_SNDBUF, [65536], 4) = 0
setsockopt(8, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0
sendmsg(8, {msg_name(0)=NULL, msg_iov(5)=[{"\x24\x00\x00\x00\x10\x00\x01\x00\x00\x00\x00\x00\xd4\x4a\x00\x00", 16}, {"\x03\x00\x00\x00", 4}, {"\x0e\x00\x02\x00", 4}, {"\x54\x41\x53\x4b\x53\x54\x41\x54\x53\x00", 10}, {"\x00\x00", 2}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 36
recvmsg(8, {msg_name(0)=NULL, msg_iov(1)=[{"\x38\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\xd4\x4a\x00\x00", 16}], msg_controllen=0, msg_flags=MSG_TRUNC}, 0) = 16
Reformatted as hex dumps, they look like this. (Note that these are from different runs, so the pid values differ, but the remainder of the reformatted strace output is the same.)
sent from iotop
24000000 10000100 01000000 42efffff
03000000 0e000200 5441534b 53544154 53000000

received by iotop
70000000 10000000 01000000 42efffff
01020000 0e000200 5441534b 53544154 53000000
06000100 17000000
08000300 01000000
08000400 00000000
08000500 04000000
2c000600
14000100 08000100 01000000 08000200 0b000000
14000200 08000100 04000000 08000200 0a000000

sent from program
24000000 10000100 00000000 d44a0000
03000000 0e000200 5441534b 53544154 53000000

received by program
38000000 02000000 00000000 d44a0000
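For reference, the first 16 bytes of each message are the Netlink header (`struct nlmsghdr`: length, type, flags, sequence number, pid). The following is a minimal sketch of decoding those four headers so they can be compared field by field. It assumes a little-endian host, which the dumps above suggest; the names `NlMsgHdr`, `parse_nlmsghdr` and `as_signed32` are made up for this illustration and do not come from iotop or my program.

```python
import struct
from typing import NamedTuple


class NlMsgHdr(NamedTuple):
    """Fields of struct nlmsghdr, in wire order."""
    length: int
    type: int
    flags: int
    seq: int
    pid: int


def parse_nlmsghdr(hex_words: str) -> NlMsgHdr:
    """Decode the 16-byte header from a space-separated hex dump."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    # Assumption: little-endian layout (u32 len, u16 type, u16 flags, u32 seq, u32 pid).
    return NlMsgHdr(*struct.unpack_from("<IHHII", raw, 0))


def as_signed32(value: int) -> int:
    """Show an unsigned 32-bit value the way strace prints the pid (signed)."""
    return value - (1 << 32) if value & 0x80000000 else value


# Header words copied from the hex dumps above.
HEADERS = {
    "sent from iotop": "24000000 10000100 01000000 42efffff",
    "received by iotop": "70000000 10000000 01000000 42efffff",
    "sent from program": "24000000 10000100 00000000 d44a0000",
    "received by program": "38000000 02000000 00000000 d44a0000",
}

if __name__ == "__main__":
    for name, words in HEADERS.items():
        hdr = parse_nlmsghdr(words)
        print(f"{name}: len={hdr.length} type={hdr.type} flags={hdr.flags} "
              f"seq={hdr.seq} pid={as_signed32(hdr.pid)}")
```

Note that the reply to my program carries a length of 56 bytes in its header, while `recvmsg` returned only 16 bytes with `MSG_TRUNC` set, so whatever follows the header is not visible in the strace.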
It seems to me that there are two differences.
iotop seems to use a negative value for the pid. I changed my program to send a negative pid as well, but this made no difference.
I use a scatter/gather approach: it's less wasteful of memory (which might be constrained on the target PC that I have in mind). However, I suspect that some (if not all) Netlink components only support sending and receiving a single buffer per request.
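To be clear about what I mean by scatter/gather: the message is handed to the kernel as several small buffers in one `sendmsg` call, and read back into several buffers in one `recvmsg` call. The sketch below illustrates only that mechanism, using a Unix datagram socket pair rather than a Netlink socket, so it says nothing about how Netlink itself treats such requests. The five pieces are the ones from my strace output; `gather_roundtrip` is a name invented for this example.

```python
import socket

# The five buffers my program passes to sendmsg(), as seen in the strace output.
PIECES = [
    bytes.fromhex("24000000" "10000100" "00000000" "d44a0000"),  # 16-byte header
    b"\x03\x00\x00\x00",
    b"\x0e\x00\x02\x00",
    b"TASKSTATS\x00",
    b"\x00\x00",  # padding
]


def gather_roundtrip():
    """Send PIECES as one datagram, then scatter the datagram into two buffers."""
    # Assumption: a Unix datagram socket pair stands in for the Netlink socket.
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with sender, receiver:
        sent = sender.sendmsg(PIECES)  # gather: many buffers, one message
        header = bytearray(16)
        payload = bytearray(64)
        # scatter: the buffers are filled in order, header first
        nbytes, _ancdata, _flags, _addr = receiver.recvmsg_into([header, payload])
    return sent, nbytes, bytes(header), bytes(payload[: nbytes - len(header)])


if __name__ == "__main__":
    sent, nbytes, header, payload = gather_roundtrip()
    print(sent, nbytes, header.hex(), payload.hex())
```

On this socket type the receiver sees a single 36-byte datagram, identical to the concatenation of the pieces; whether the same holds for every Netlink component is exactly what I'm unsure about.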
Does anyone know if Netlink allows scatter/gather, or if it requires all communications to be done in one large buffer at a time?