11

To better understand Rust's panic/exception mechanisms, I wrote the following piece of code:

#![feature(libc)]

extern crate libc;

fn main() {
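    // Read one character from stdin so the divisor is not known at compile time.
    let x: i32 = unsafe { libc::getchar() };

    // 'A' is ASCII 65, so entering 'A' makes y zero.
    let y = x - 65;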
    println!("{}", x);

    let z = 1 / y;
    println!("{}", z);
}

I wanted to check how Rust deals with division by zero cases. Originally I assumed it was either taking an unhandled SIGFPE to the face and dying or it implemented a handler and rerouted it to a panic (which can be dealt with nowadays?).

The code is verbose because I wanted to make sure that Rust does not do anything "smart" when it knows at compile-time that something is zero, hence the user input. Just give it an 'A' and it should do the trick.

I found out that Rust actually produces code that checks for zero division every time before the division happens. I even looked at the assembly for once. :-)
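For illustration, the check surfaces as an ordinary panic that can be caught. This is only a minimal sketch: it assumes the default unwinding panic strategy (not `panic = "abort"`), and `divide` is a name made up for this example.

```rust
use std::panic;

// `divide` is a hypothetical helper; integer `/` carries the zero check.
fn divide(a: i32, b: i32) -> i32 {
    a / b
}

fn main() {
    // Parse the divisor at run time so the compiler cannot reject the division up front.
    let divisor: i32 = "0".parse().unwrap();

    // catch_unwind returns Err if the closure panicked.
    let result = panic::catch_unwind(|| divide(1, divisor));
    println!("panicked: {}", result.is_err());
}
```

The panic message still goes to stderr; `catch_unwind` only stops the unwinding at that point.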

Long story short: Can I disable this behaviour? I imagine for larger datasets this can have quite a performance impact. Why not use our CPU's ability to detect this stuff for us? Can I set up my own signal handler and deal with the SIGFPE instead?

According to an issue on GitHub the situation must have been different some time ago.

I think checking every division beforehand is far away from "zero-cost". What do you think? Am I missing something obvious?

Shepmaster
Lazarus535
  • I'm confused. Do you want to use checked division or disable checking if the divisor is zero? – squiguy Mar 02 '17 at 00:23
  • 1
    "Why not use our CPUs ability to detect this stuff for us?" Not all CPUs have that ability. x86 raises an exception on division by zero, but others don't; e.g. ARM silently returns a result of 0. – Nate Eldredge Mar 12 '21 at 16:04

2 Answers

13

I think checking every division beforehand is far away from "zero-cost". What do you think?

What have you measured?

The number of instructions executed is a very poor proxy for performance; vectorized code is generally more verbose, yet faster.

So the real question is: what is the cost of this branch?

Since intentionally dividing by 0 is rather unlikely, and doing it by accident is only slightly more likely, the branch will always be predicted correctly except when a division by 0 occurs. But then, given the cost of a panic, a mispredicted branch is the least of your worries.

Thus, the cost is:

  • a slightly fatter assembly,
  • an occupied slot in the branch predictor.

The exact impact is hard to quantify, and for math-heavy code it might be noticeable. Though I would remind you that an integer division costs up to ~100 cycles [1] to start with, so math-heavy code will shy away from it as much as possible (it's maybe THE single most time-consuming instruction in your CPU).

[1] See Agner Fog's Instruction Tables: for example, on Intel Nehalem, DIV and IDIV on 64-bit integers have a latency of 28 to 90 cycles and 37 to 100 cycles respectively.


Beyond that, rustc is implemented on top of LLVM, to which it delegates actual code generation. Thus, rustc is at the mercy of LLVM for a number of cases, and this is one of them.

LLVM has two integer division instructions: udiv and sdiv.

Both have Undefined Behavior with a divisor of 0 (and sdiv additionally on overflow, i.e. when the minimum value is divided by -1).

Rust aims at eliminating Undefined Behavior, so it has to prevent division by 0 from occurring, lest the optimizer mangle the emitted code beyond repair.

It uses a check, as recommended in the LLVM manual.
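Written out in source form, the check looks roughly like the sketch below. It is an illustration only, not what rustc literally emits: `div_with_explicit_checks` is a hypothetical name, and the panic messages are placeholders.

```rust
/// Hypothetical helper spelling out the conditions guarded before a signed division.
fn div_with_explicit_checks(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("attempt to divide by zero");
    }
    if a == i32::MIN && b == -1 {
        // The only overflowing case: the result would not fit in an i32.
        panic!("attempt to divide with overflow");
    }
    // The built-in `/` repeats the same checks; after the branches above
    // the optimizer can usually prove them redundant.
    a / b
}

fn main() {
    println!("{}", div_with_explicit_checks(7, 2)); // prints 3
    // The standard library offers the same checks without a panic:
    println!("{:?}", 7i32.checked_div(0)); // prints None
    println!("{:?}", i32::MIN.checked_div(-1)); // prints None
}
```

`checked_div` is the stable way to handle both cases explicitly instead of panicking.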

Matthieu M.
  • 2
    _integer division is ~100 cycles_ <- That was really the key point for me. With such a cost, the branch is really negligible. I had different numbers in my head (2-4 cycles). Very good answer, thx! – Lazarus535 Mar 02 '17 at 13:47
  • I did some tests with C/C++ in a similar scenario, also compiled with Clang(++), and neither checks anything. It somehow puzzles me that Clang does not honor this recommendation of LLVM, given how "close" they are. – Lazarus535 Mar 02 '17 at 13:50
  • 2
    @Lazarus535: Ah, but in C and C++ integer division has undefined behavior (as per specs), therefore it's "normal" for Clang to invoke `udiv` and `sdiv` without checking beforehand. – Matthieu M. Mar 02 '17 at 13:55
  • 2
    @Lazarus535 Clang is explicitly allowed by the C standard to produce undefined behavior when dividing by zero. Rust on the other hand avoids undefined behavior in safe code. Rust would remain within its promises if it offered an `unsafe fn divide_unchecked(divisor: T)` method on integer types, but it doesn't exist, likely for reasons explained in the answer. – user4815162342 Mar 02 '17 at 13:55
  • This cost analysis is missing the largest cost: this checking makes your functions much larger so they are less likely to be inlined, which can be arbitrarily expensive because of the opportunity cost of other lost optimizations. There are also a finite number of entries in the branch prediction table, and if systemically every single division in your program is generating this extra code it becomes much more likely to be a significant effect. Rust already uses a SIGSEGV handler for zero-cost stack overflow detection, why not SIGFPE for zero cost divide by zero detection? – Joseph Garvin Mar 15 '21 at 15:28
  • @JosephGarvin: With LLVM, that's indeed a concern, as AFAIK LLVM is not yet capable of performing selective outlining of cold code blocks. At the same time, inlining is based on heuristics to start with, so if it matters you need to use manual hints. With regard to your second remark, you are mistaken in assuming that SIGFPE can handle this => it cannot. The problem of Undefined Behavior is that it affects the behavior of the optimizer; because dividing by 0 is UB, LLVM is allowed to assume that the divisor is not 0, and optimize accordingly. No signal will catch that. – Matthieu M. Mar 15 '21 at 16:05
  • @MatthieuM seems like LLVM IR needs a new instruction for division where divide by zero is defined to trap :) – Joseph Garvin Mar 15 '21 at 16:11
  • @JosephGarvin Yes, there's definitely a gap in LLVM IR for "poisoning" arithmetic instructions. This applies to divide by 0, but also all additions/subtractions/multiplications that may overflow. You essentially have the choice between UB and the debug instructions used by UBSan. It would be great to have a middle tier that poisons the value/sets a flag causing the computation to fail later -- with fuzzy reporting of what went wrong, they could likely be faster than the debug instructions. – Matthieu M. Mar 15 '21 at 16:42
5

Long story short: Can I disable this behaviour?

Yes, you can: std::intrinsics::unchecked_div(a, b). Your question also applies to the remainder (that's what Rust calls modulo): std::intrinsics::unchecked_rem(a, b). I checked the assembly output here to compare it to C++.

In the documentation it states:

This is a nightly-only experimental API. (core_intrinsics)

intrinsics are unlikely to ever be stabilized, instead they should be used through stabilized interfaces in the rest of the standard library

So you have to use the nightly build, and the intrinsic is unlikely to ever be stabilized in this form, for the reasons Matthieu M. already pointed out.
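On stable Rust, one alternative that avoids a per-division zero check is to move the check into the type of the divisor. The sketch below assumes a reasonably recent toolchain (dividing `u32` by `NonZeroU32` was stabilized in Rust 1.51, as far as I know), and `divide_all` is a hypothetical helper, not part of the standard library.

```rust
use std::num::NonZeroU32;

/// Hypothetical helper: the divisor is validated once, up front.
/// Returns None if the divisor is zero.
fn divide_all(values: &[u32], divisor: u32) -> Option<Vec<u32>> {
    // The only zero check happens here.
    let d = NonZeroU32::new(divisor)?;
    // u32 / NonZeroU32 cannot divide by zero, so it needs no check and cannot panic.
    Some(values.iter().map(|&v| v / d).collect())
}

fn main() {
    println!("{:?}", divide_all(&[10, 20, 30], 5)); // prints Some([2, 4, 6])
    println!("{:?}", divide_all(&[10, 20, 30], 0)); // prints None
}
```

This only covers unsigned division in this sketch; whether it beats the checked version in practice is something to measure rather than assume.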

spackogesicht
  • Thank you for this very interesting answer! Tough decision, but I would accept this now. Any objections? – Lazarus535 Aug 06 '18 at 10:58
  • @Lazarus535 I think you asked two questions: "_Long story short: Can I disable this behaviour?_" and "_I think checking every division beforehand is far away from "zero-cost". What do you think? Am I missing something obvious?_". I answered the first one and Matthieu M. answered the second one. No objections from me though :). – spackogesicht Aug 06 '18 at 16:14