14

I'd like to emulate a violent system shutdown, i.e. to get as close as possible to a power outage at the application level. We are talking about a C/C++ application on Linux. I need the application to terminate itself.

Currently I see several options:

  1. call exit()
  2. call _exit()
  3. call abort()
  4. do division by zero or dereference NULL.
  5. other options?

What is the best choice?

Partly duplicate of this question

Drakosha
    Start running your program, then reach down to your computer, and as the program is running, yank out the power cable. – Chris Lutz May 19 '09 at 05:17
  • Sorry, unacceptable solution. – Drakosha May 19 '09 at 05:18
  • So, one moment, the app is running; the next, you're staring at a command prompt. In the DOS days, I called this "dial tone". – gbarry May 19 '09 at 06:24
  • 2
    What exactly is your purpose? To completely stop the application quickly? For recovery testing? What? – Kris Kumler May 19 '09 at 22:18
  • 3
    @ChrisLutz: LOL. Short of incendiary devices, or explosives, you're going to be hard-pressed to top Chris' answer. – WhozCraig Sep 07 '12 at 08:09
  • 1
    How about running in a VM and choosing 'Reset'? You won't be able to simulate a power outage on app level, since you'll never (normally, on app level) force drivers (or the firmware) to not finish their work. – ActiveTrayPrntrTagDataStrDrvr Nov 15 '12 at 13:32

14 Answers

28

IMHO the closest you can come to a power outage is to run the application in a VM and to power off the VM without shutting down. In all other cases, where the OS is still running when the application terminates, the OS will do some cleanup that would not occur in a real power outage.

lothar
15

At the application level, the most violent you can get is _exit(). Division by zero, segfaults, etc. are all delivered as signals, which can be trapped; if untrapped, they're basically the same as _exit(), but may leave a core dump depending on the signal.
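To make the difference between exit() and _exit() concrete, here is a minimal sketch, assuming Linux with glibc. The helper name `bytes_delivered` is made up for illustration. A child process leaves 8 bytes in a stdio buffer and then terminates one way or the other; the parent counts what actually arrives through a pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the answer): fork a child that leaves
 * 8 bytes in a stdio buffer and then ends via exit() or _exit().
 * Returns how many bytes reached the pipe, or -1 on a setup error. */
long bytes_delivered(int use_underscore_exit)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    fflush(NULL);                 /* do not let the child inherit pending output */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {               /* child */
        close(fds[0]);
        FILE *out = fdopen(fds[1], "w");   /* fully buffered: not a terminal */
        if (out == NULL)
            _exit(1);
        fputs("buffered", out);   /* 8 bytes, still in the user-space buffer */
        if (use_underscore_exit)
            _exit(0);             /* no atexit handlers, no stdio flush */
        exit(0);                  /* flushes and closes all stdio streams */
    }

    close(fds[1]);                /* parent: read until the child's end closes */
    char buf[64];
    long total = 0;
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

With this sketch, `assert(bytes_delivered(0) == 8)` and `assert(bytes_delivered(1) == 0)` should both hold: the data sitting in the user-space buffer is simply lost when the child calls _exit().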

If you truly want a hard shutdown, the best bet is to cut power in the most violent way possible. Invoking /sbin/poweroff -fn is about as close as you can get, although it may do some cleanup at the hardware level on its way out.

If you really want to stress things, though, your best bet is to really, truly cut the power - install some sort of software controlled relay on the power cord, and have the software cut that. The uncontrolled loss of power will turn up all sorts of weird stuff. For example, data on disk can be corrupted due to RAM losing power before the DMA controller or hard disk. This is not something you can test by anything other than actually cutting power, in your production hardware configuration, over multiple trials.

bdonlan
  • Being killed by an external process is more abrupt and non-deterministic than _exit() which performs quite a lot of [clean-up](http://linux.die.net/man/2/_exit), and occurs exactly where the programmer chooses to call it. – Clifford Apr 26 '11 at 20:16
  • On Linux, `_exit()` performs exactly the same cleanup as being killed by another process - mapped memory is unmapped (and potentially freed), open file handles are closed, file locks are released, and robust mutexes are set to a stalled state. Child processes are reparented to init, and the process's parent is notified. You can't avoid this, short of crashing the kernel. – bdonlan Apr 27 '11 at 03:47
  • @bdonlan: I read that SIGKILL allows the *process* to perform no clean-up. I suppose that it is possible for the OS to do so, but that is OS dependent. Even so, the process exiting itself deterministically is very different from being terminated externally and unpredictably. So even if some clean-up is performed, _exit() does not present the most severe terminating condition. – Clifford Apr 27 '11 at 10:51
  • 1
    UNIX operating systems always do the cleanups I just mentioned. Even with SIGKILL. `_exit` is, according to POSIX, allowed to do additional cleanup such as flushing IO buffers and deleting temp files, but on Linux (which the OP asked about) it does not. – bdonlan Apr 27 '11 at 14:34
13
kill -9

It kills a process and does not allow any signal handlers to run.

Bill Lynch
  • 1
    +1 as this is probably as close as you can get without trying to mess with allocated memory inside the application – Wayne May 19 '09 at 05:22
  • 13
    To really simulate a power failure, you're going to need to do better than kill -9 since that kills the application process dead dead dead, but won't prevent the operating system from flushing dirty write buffers to disk. In a real power outage, the application might have completed some writes that the operating system is still waiting to write to disk. Power goes out, and the pending writes vanish. – Dave W. Smith May 19 '09 at 05:47
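The point about pending writes can be checked with a small sketch (the helper name and the temp-file path are assumptions for illustration). The child hands data to the kernel with write(), never calls fsync(), and is then killed with SIGKILL; the parent can still read the data back, because it already lives in the kernel's page cache and the kernel keeps running.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* only needed for the check in the usage note */
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if data written by a SIGKILLed child is
 * readable afterwards, 0 if it is not, and -1 on a setup error. */
int write_survives_sigkill(void)
{
    char path[] = "/tmp/sigkill_demo_XXXXXX";   /* assumed scratch location */
    const char msg[] = "pending write";
    const size_t len = sizeof msg - 1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        /* write() copies the data into the kernel; there is no fsync() */
        if (write(fd, msg, len) != (ssize_t)len)
            _exit(1);
        kill(getpid(), SIGKILL);   /* die without any user-space cleanup */
        _exit(2);                  /* not reached */
    }
    waitpid(pid, NULL, 0);

    char buf[sizeof msg] = {0};
    ssize_t n = pread(fd, buf, len, 0);   /* read back from offset 0 */
    close(fd);
    unlink(path);
    return n == (ssize_t)len && memcmp(buf, msg, len) == 0;
}
```

Here `assert(write_survives_sigkill() == 1)` should hold, which is exactly why kill -9 is a weaker test than a real power cut: after a power cut that data might never reach the disk.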
7

Why not do a halt? Or call panic?

Charlie Martin
7

Try

raise(SIGKILL)

in the process, or from the command line:

kill -9 pid

where pid is the PID of your process (these two methods are equivalent and should not perform any cleanup)
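A minimal sketch of the in-process variant, wrapped in a fork so the effect can be observed from the parent (the helper name `self_kill_signal` is made up for this example). It also shows that the kernel refuses any attempt to change the disposition of SIGKILL.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* only needed for the check in the usage note */
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sends SIGKILL to itself and
 * return the number of the signal that terminated it, or -1 otherwise. */
int self_kill_signal(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* SIGKILL cannot be caught or ignored; this call must fail */
        if (signal(SIGKILL, SIG_IGN) != SIG_ERR)
            _exit(2);
        kill(getpid(), SIGKILL);   /* same effect as raise(SIGKILL) here */
        _exit(3);                  /* not reached */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

For example, `assert(self_kill_signal() == SIGKILL)` should hold. In a real test you would call kill(getpid(), SIGKILL) directly at the point where the application is supposed to die.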

user253751
  • 1
    Or: kill(getpid(), SIGKILL); Using 'raise()' was a C standards committee invention because the C standard does not recognize process IDs as a valid concept. – Jonathan Leffler May 19 '09 at 05:53
5

You're unclear as to what your requirements are. If you're doing tests of how you will recover from a power failure, you need to actually cause a power failure. Even doing things like a kernel panic will allow write buffers on hard disks to flush, since they are independent of the CPU.

A remote power strip might be a solution if you really need to test the complete failure case.

Don Neufeld
2

You could try using a virtual machine. Freeze it, screw it hard, and see what happens.

Otherwise kill -9 would be the best solution.

AndreasT
2

If you need the application to terminate itself, the following seems appropriate:

kill(getpid(), SIGKILL); // same as kill -9

If that's not violent enough (and it may not be), then I like the idea of terminating a VM inside which your application is running. You should be able to rig up something where the application can send a command to the host machine (via ssh or something) to terminate its own VM.

Greg Hewgill
1

Any solution where the process terminates itself programmatically does not emulate an asynchronous termination in any way. It is entirely deterministic in the sense that it will terminate at the same point in the code every time.

Of your suggestions

  • exit() is defined to "terminate normally, performing the regular cleanup for terminating processes." - hardly violent!

  • _exit() performs some subset of exit()'s operations, but remains 'nice' to the application, the OS, and its resources.

  • abort() creates a SIGABRT, and the OS may choose to perform clean-up of some resources.

  • Integer division by zero typically raises SIGFPE on Linux, so it probably behaves much like abort().
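As a rough illustration of the abort() item, the sketch below (helper name assumed) runs abort() in a child and reports how the parent sees the termination. The core-dump limit is set to zero in the child only so the experiment does not leave core files behind.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* only needed for the check in the usage note */
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that calls abort() and return the
 * number of the signal that terminated it, or -1 otherwise. */
int signal_from_abort(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };
        setrlimit(RLIMIT_CORE, &no_core);   /* suppress the core file */
        abort();                            /* raises SIGABRT */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

With no handler installed, `assert(signal_from_abort() == SIGABRT)` should hold. Unlike SIGKILL, SIGABRT can be caught, so the application gets a chance to run code before it dies.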

It is probably best not to have the application terminate itself, but to have some external process kill it asynchronously so that termination may occur at any point in the execution. Use kill from another process or script, on a randomised timer, to send SIGKILL, which cannot be trapped and gives the process no chance to clean up. If you must have the process terminate itself, do it from an asynchronous thread that wakes up after some non-deterministic time and kills the process, but even then you will know which thread was running when it terminated. Even using these methods, there is no way a process can be terminated mid-CPU-cycle as a real power-down might, and any cached or buffered data pending output may still appear or be written after process termination.
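The external-killer idea could look roughly like the sketch below. The helper name, the busy loop standing in for the application, and the 60-second cap are all assumptions for illustration; a real harness would exec the application under test instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* only needed for the check in the usage note */
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical harness: start a child, SIGKILL it after a random delay
 * of at most max_delay_ms, and return the signal that terminated it
 * (or -1 on error). */
int kill_child_at_random_time(unsigned max_delay_ms)
{
    if (max_delay_ms > 60000u)
        max_delay_ms = 60000u;             /* arbitrary cap for the sketch */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        volatile unsigned long counter = 0;
        for (;;)
            counter++;                     /* stand-in for application work */
    }

    srand((unsigned)time(NULL) ^ (unsigned)getpid());
    unsigned delay_ms = (unsigned)rand() % (max_delay_ms + 1u);
    struct timespec ts;
    ts.tv_sec = delay_ms / 1000u;
    ts.tv_nsec = (long)(delay_ms % 1000u) * 1000000L;
    nanosleep(&ts, NULL);                  /* child runs for a random time */

    kill(pid, SIGKILL);                    /* cannot be caught or ignored */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

For example, `kill_child_at_random_time(50)` should return SIGKILL. The child cannot predict or influence the point at which it dies, which is the property a self-inflicted _exit() lacks.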

Clifford
1

I've had regression tests that we used to perform where we flicked the power switch to OFF. While doing disk IO.

Failure to recover later was, well: a failure.

You can buy reliability like that: generally you'll need an "end user certificate".

You can get there in software by talking (dirty) to your UPS. APC UPSes will definitely do power off under software control!

Who says systems can't power cycle themselves?

Tim Williscroft
1

Infinite recursion, should run out of stack space (if not, the OOM killer will finish the job):

void a() { a(); }

Fork bomb (if the app doesn't have any fork limits then the OOM killer should kill the app at some point):

  while(1)
    fork();

Run out of memory:

  while(1)
    malloc(1);
1

Within a single running process kill(getpid(), SIGKILL) is the most extreme, as no cleanup is possible.

Otherwise, try a VM, or put a test machine on a power strip and turn the power off, if you are doing automated testing.

  • Using a VM and killing the VM process is a great idea since it will simulate whole machine failure without killing the host, and neither the app nor the OS has any opportunity to clean up. However, if this damages the VM's OS more than the application, it does not really test the application's resilience, which I imagine is the aim. – Clifford Apr 26 '11 at 20:10
0

On a recent system, a process with superuser privileges could take realtime CPU/IO priority, lock all addressable memory, spew garbage across /proc, /dev, /sys, LAN/WiFi, firmware ioctls, and flash memory simultaneously, overclock/overvolt the CPU/GPU/RAM, and have a good chance of exiting by causing something nearby to Halt and Catch Fire.

If the process only needs to do metaphorical violence, it could stop at /proc/sysrq-trigger.
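For the /proc/sysrq-trigger route, a minimal sketch might look like this. The helper name `send_sysrq` is made up, and the path is a parameter so the function can be tried against a harmless file first. Writing to the real trigger requires root and takes the whole machine down, so treat this as something to run only on a disposable test box or VM.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* only needed for dry-run checks */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write a single SysRq command character to `path`.
 * Returns 0 if the byte was written, -1 otherwise. */
int send_sysrq(const char *path, char command)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, &command, 1);
    close(fd);
    return n == 1 ? 0 : -1;
}
```

A dry run such as `send_sysrq("/dev/null", 'b')` only exercises the plumbing. According to the kernel's SysRq documentation, writing 'b' to /proc/sysrq-trigger reboots immediately without syncing or unmounting disks, 'c' crashes the kernel, and 'o' powers the machine off, so `send_sysrq("/proc/sysrq-trigger", 'b')` is about as close to pulling the plug as software on the machine itself can get.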

user130144
0

As pointed out, try to consume as many resources as possible until the kernel kills you:

    while (1) {
        malloc(1);
        fork();
    }

Another way is to write to a read-only page: just keep writing to memory until you get a segmentation fault or bus error.

If you can get to the kernel, a great way to kill it is simply to write over a data structure the kernel uses; bonus points if you find a page that should be read-only but is marked writable, and then overwrite it. BTW, most Linux kernels allow writing to the syscall_table or interrupt table; if you write there, your system will crash for sure.

A M
daniel