The `execve` syscall is somewhat expensive; it would be unreasonable to run it more than a few dozen (or perhaps a few hundred) times per second, even though a single call usually takes a fraction of a millisecond to a few milliseconds.
It is probably faster (and cleaner) than the dozen or so equivalent calls to mmap(2) (and munmap(2) and mprotect(2)) and setcontext(3) you would need to roughly mimic it (and then there is the issue of killing every running thread except the one doing the `execve`, and of releasing other resources attached to the process, e.g. `FD_CLOEXEC`-ed file descriptors).
(You won't be able to replicate exactly what `execve` does with `mmap`, `munmap`, `setcontext`, and `close`, but you might get close enough... and it would be ridiculous anyway.)
The practical cost of `execve` should also take into account the dynamic loading of the shared libraries (which are loaded before `main` runs, but technically after the `execve` syscall itself...) and their startup.
The question might not mean much: it heavily depends on the actual state of the machine and on the `execve`-d executable. I guess that `execve`-ing a huge ELF binary (some executables might have a gigabyte of code segment; the mythical Google crawler, for instance, is rumored to be a monolithic program with a billion lines of C++ source code, and at some point it was statically linked), or one with hundreds of shared libraries, takes much longer than `execve`-ing the usual `/bin/sh`.
I also guess that an `execve` from a process with a terabyte-sized address space takes much longer than the usual `execve` my `zsh` shell does on my desktop.
A typical reason for a process to `execve` its own program (actually some updated version of it) is a long-lasting server whose binary executable has been updated.
Another reason to `execve` its own program is to have a more-or-less "stateless" server (e.g. some web server for static content) restart itself and reload its configuration files.
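As a rough illustration, here is a minimal, hypothetical sketch of that self-re-exec pattern (my own example, not from any particular server), assuming a Linux system where `/proc/self/exe` names the running program's (possibly updated) executable; the SIGHUP-driven restart and the `reexec_requested` flag are illustrative assumptions.

```c
/* Minimal sketch (assumption, not a real server): re-execute our own
 * (possibly updated) binary on SIGHUP, which also re-reads the
 * configuration files during the fresh startup. */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern char **environ;

static volatile sig_atomic_t reexec_requested = 0;

static void on_sighup(int sig) { (void)sig; reexec_requested = 1; }

int main(int argc, char **argv)
{
    (void)argc;
    signal(SIGHUP, on_sighup);
    /* ... load configuration files, open listening sockets, ... */
    for (;;) {
        pause(); /* placeholder for the real event loop */
        if (reexec_requested) {
            /* On Linux, /proc/self/exe names our own executable,
             * which may have been replaced on disk by an update. */
            execve("/proc/self/exe", argv, environ);
            perror("execve"); /* only reached if execve failed */
            exit(EXIT_FAILURE);
        }
    }
}
```

Note that file descriptors marked `FD_CLOEXEC` are closed by the `execve`, so the restarted instance reopens its sockets from scratch.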
More generally, this is an entire research subject: read about dynamic software updating, application checkpointing, persistence, etc. See also the references here.
It is the same for dumping a core(5) file: in my life, I have never seen a core dump take more than a fraction of a second, but I did hear that on early-1990s Cray computers a `core` dump could (pathologically) last half an hour... So I imagine that some pathological `execve` could take quite a long time (e.g. bringing a terabyte of code segment into RAM using copy-on-write techniques; this is not counted as `execve` time, but it is part of the cost of starting the program; and you might also have many relocations for many shared libraries).
Addenda
For a small executable (less than a few megabytes), you can afford several hundred `execve`s per second, so in practice it is not a big deal. Notice that a shell script running usual commands like `ls`, `mv`, ... is `execve`-ing quite a lot (very often after a `fork`, which it does for nearly every command). If you suspect some issue, you could benchmark it (e.g. with strace(1), using `strace -tt -T -f ...`). On my desktop (Debian/x86-64/Sid, i7-3770K), an `execve` of `/bin/ls` (measured with `strace -T -f -tt zsh-static -c ls`) takes about 250 µs (for an ELF binary executable `/bin/ls` of 118 kilobytes, probably already in the page cache), and for `ocamlc` (a 1.8-megabyte binary) about 1.3 ms; a `malloc` usually takes from half a µs to a few µs; and a call to time(2) takes about 3 ns (avoiding the overhead of a syscall through vdso(7)...).
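If you want a rough number without strace's overhead, here is a minimal, hypothetical benchmark sketch (my own illustration, not the measurements above) that times repeated `fork` + `execve` + `waitpid` cycles with clock_gettime(2); the choice of `/bin/true` (to avoid output noise) and the iteration count are arbitrary assumptions.

```c
/* Minimal sketch (my own illustration): time N fork+execve+waitpid
 * cycles of /bin/true to estimate the per-execve cost on this machine. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    enum { N = 200 };                 /* arbitrary iteration count */
    char *child_argv[] = { "true", NULL };
    char *child_envp[] = { NULL };
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < N; i++) {
        pid_t pid = fork();
        if (pid == 0) {               /* child: replace itself */
            execve("/bin/true", child_argv, child_envp);
            _exit(127);               /* only reached if execve failed */
        } else if (pid > 0) {
            waitpid(pid, NULL, 0);
        } else {
            perror("fork");
            exit(EXIT_FAILURE);
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double us = (t1.tv_sec - t0.tv_sec) * 1e6
              + (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("%d fork+execve+wait cycles: %.0f µs total, %.1f µs each\n",
           N, us, us / N);
    return 0;
}
```

Note that this measures the whole `fork` + `execve` + wait round trip (including process teardown), which is usually what matters in practice for shell-like workloads.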