
I know that information exchange can happen via the following interfaces between the kernel and user space programs:

  • system calls

  • ioctls

  • /proc & /sys

  • netlink

I want to find out

  • Have I missed any other interface?

  • Which one of them is the fastest way to exchange large amounts of data? (And is there any document/mail/explanation supporting such a claim that I can refer to?)

  • Which one is the recommended way to communicate? (I think it's netlink, but I would still love to hear opinions.)

Methos

6 Answers


The fastest way to exchange vast amounts of data is memory mapping. The mmap call can be used on a device file, and the corresponding kernel driver can then decide to map kernel memory into the user address space. A good example of this is the Video4Linux drivers, and I suppose the frame buffer driver works the same way. For a good explanation of how the V4L2 driver works, see:

You can't beat memory mapping for large amounts of data, because there is no memcpy-like operation involved: the physical underlying memory is effectively shared between kernel and userspace. Of course, as with all shared-memory mechanisms, you have to provide some synchronisation so that kernel and userspace don't think they have ownership at the same time.
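
To make the idea concrete, here is a minimal sketch of the userspace side, assuming a hypothetical /dev/mydev whose driver implements the mmap file operation and exports a 1 MiB buffer (the device name and size are made up for illustration):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define BUF_SIZE (1 << 20)   /* 1 MiB, assumed to match the driver */

int main(void)
{
    int fd = open("/dev/mydev", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    /* The driver's mmap handler maps its kernel buffer into our
     * address space; subsequent accesses involve no copying. */
    unsigned char *buf = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE,
                              MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    printf("first byte: %u\n", buf[0]);  /* direct read, no memcpy */

    munmap(buf, BUF_SIZE);
    close(fd);
    return 0;
}
```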

shodanex

Shared memory between kernel and userspace is doable; see http://kerneltrap.org/node/14326 for instructions/examples.

You can also use a named pipe, which is pretty fast.
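
For example, a minimal reader for a named pipe might look like this (the /tmp/myfifo path is just an illustration; any other process acts as the writer):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    char buf[256];
    mkfifo("/tmp/myfifo", 0666);            /* harmless if it already exists */
    int fd = open("/tmp/myfifo", O_RDONLY); /* blocks until a writer opens it */
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        printf("got: %s", buf);
    }
    close(fd);
    return 0;
}
```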

All this really depends on what data you are sharing, whether it is accessed concurrently, and how the data is structured. Plain system calls may be enough for simple data.

Linux kernel /proc FIFO/pipe might also help.

good luck

Aiden Bell

You may also consider relay (formerly relayfs):

"Basically relayfs is just a bunch of per-cpu kernel buffers that can be efficiently written into from kernel code. These buffers are represented as files which can be mmap'ed and directly read from in user space. The purpose of this setup is to provide the simplest possible mechanism allowing potentially large amounts of data to be logged in the kernel and 'relayed' to user space."

http://relayfs.sourceforge.net/
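
A rough kernel-side sketch of setting up a relay channel, assuming a reasonably recent kernel and following the pattern described in Documentation/filesystems/relay.rst (the "mylog" name and buffer sizes are arbitrary):

```c
#include <linux/module.h>
#include <linux/relay.h>
#include <linux/debugfs.h>

static struct rchan *chan;

/* relay asks us where to put each per-cpu buffer file; here we
 * expose them in debugfs using relay's own file_operations. */
static struct dentry *create_buf_file(const char *filename,
                                      struct dentry *parent, umode_t mode,
                                      struct rchan_buf *buf, int *is_global)
{
    return debugfs_create_file(filename, mode, parent, buf,
                               &relay_file_operations);
}

static int remove_buf_file(struct dentry *dentry)
{
    debugfs_remove(dentry);
    return 0;
}

static const struct rchan_callbacks relay_cbs = {
    .create_buf_file = create_buf_file,
    .remove_buf_file = remove_buf_file,
};

static int __init relay_example_init(void)
{
    /* 8 sub-buffers of 4 KiB per cpu; userspace mmaps or reads
     * the per-cpu files that appear under debugfs. */
    chan = relay_open("mylog", NULL, 4096, 8, &relay_cbs, NULL);
    if (!chan)
        return -ENOMEM;
    relay_write(chan, "hello\n", 6);   /* cheap per-cpu write */
    return 0;
}

static void __exit relay_example_exit(void)
{
    relay_close(chan);
}

module_init(relay_example_init);
module_exit(relay_example_exit);
MODULE_LICENSE("GPL");
```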

Dénes Tarján

You can obviously do shared memory with copy_from_user etc., and you can easily set up a character device driver: basically all you have to do is fill in a file_operations structure. But this is by far not the fastest way.

I have no benchmarks, but system calls on modern systems should be the fastest. My reasoning is that they are what has been optimized for the most. It used to be that to get from user mode to kernel mode one had to raise an interrupt, which would then go through the interrupt table (an array) to locate the interrupt handler (vector 0x80), and only then switch to kernel mode. This was really slow, and then came the sysenter instruction, which makes this process really fast. Without going into details, sysenter loads CS:EIP directly from machine-specific registers, and the switch is quite fast. Shared memory, on the contrary, requires writing to and reading from memory, which is infinitely more expensive than reading from a register.
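
To show how little is involved, here is a minimal sketch of such a character device, assuming a reasonably recent kernel; the "hello" device name and message are made up, and the misc-device helper is used to avoid managing major numbers by hand:

```c
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/miscdevice.h>

static const char msg[] = "hello from the kernel\n";

/* Copies data to user space on read(2); simple_read_from_buffer
 * wraps the copy_to_user bookkeeping. */
static ssize_t hello_read(struct file *f, char __user *buf,
                          size_t len, loff_t *off)
{
    return simple_read_from_buffer(buf, len, off, msg, sizeof(msg));
}

static const struct file_operations hello_fops = {
    .owner = THIS_MODULE,
    .read  = hello_read,
};

static struct miscdevice hello_dev = {
    .minor = MISC_DYNAMIC_MINOR,
    .name  = "hello",              /* shows up as /dev/hello */
    .fops  = &hello_fops,
};

module_misc_device(hello_dev);     /* registers on load, deregisters on unload */
MODULE_LICENSE("GPL");
```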

daniel
  • Surely you meant "several orders of magnitude" instead of "infinitely" [/nitpick] – Piskvor left the building Jun 03 '09 at 07:53
  • a system call still requires a context switch and saving/restoring registers, regardless of whether you `int` or `sysenter`. shmem writes go into the CPU cache, not directly to memory, so they are fast unless there are cache misses. – Peter Cordes Dec 09 '09 at 19:55

Here is a possible compilation of all the possible interfaces, although in some ways they overlap one another (e.g., sockets are themselves used through system calls):

  • Procfs
  • Sysfs
  • Configfs
  • Debugfs
  • Sysctl
  • devfs (e.g., character devices)
  • TCP/UDP sockets
  • Netlink sockets
  • Ioctl
  • Kernel system calls
  • Signals
  • Mmap
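
To give one concrete instance from the list, here is a sketch of a userspace netlink socket subscribed to kernel uevents (device add/remove notifications); it is an illustration, not a full netlink protocol implementation:

```c
#include <linux/netlink.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    struct sockaddr_nl addr = {
        .nl_family = AF_NETLINK,
        .nl_groups = 1,            /* the uevent multicast group */
    };
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_KOBJECT_UEVENT);
    if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("netlink");
        return 1;
    }

    char buf[4096];
    ssize_t n = recv(fd, buf, sizeof(buf) - 1, 0); /* blocks for one event */
    if (n > 0) {
        buf[n] = '\0';
        printf("uevent: %s\n", buf);   /* e.g. "add@/devices/..." */
    }
    close(fd);
    return 0;
}
```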
Peter Teoh

As for shared memory: I've found that even with NUMA, two threads running on two different cores and communicating through shared memory still require writes to and reads from the L3 cache, which, if you are lucky (both cores on one socket), is about 2x slower than a syscall, and if you are not (cores on different sockets), is about 5x or more slower than a syscall. I think the syscall's hardware mechanism helps here.

bing zhu