Require compiler to emit branchless/constant-time code

Question

In cryptography, any piece of code that depends on secret data (such as a private key) must execute in constant time in order to avoid side-channel timing attacks.

The most popular architectures currently (x86-64 and ARM AArch64) both support certain kinds of conditional execution instructions, such as:

CMOVcc, SETcc for x86-64
CSINC, CSINV, CSNEG for AArch64

Even when such instructions are not available, there are techniques to convert a piece of code into a branchless version. Performance may suffer, but in this scenario it's not the primary goal -- running in constant time is.
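
As an illustration of the kind of technique meant here, the following is a minimal sketch of a mask-based select in C. The helper name `ct_select_u32` is hypothetical, and the sketch assumes `cond` is exactly 0 or 1. Nothing in the C standard prevents a compiler from turning this back into a branch, which is precisely the concern of this question.

```c
#include <stdint.h>

/* Hypothetical helper: returns a when cond == 1 and b when cond == 0.
 * Assumption: cond is exactly 0 or 1. */
static uint32_t ct_select_u32(uint32_t cond, uint32_t a, uint32_t b)
{
    /* Unsigned wrap-around: 0 - 1 gives 0xFFFFFFFF, 0 - 0 gives 0. */
    uint32_t mask = (uint32_t)0 - cond;

    /* Keep the bits of a where mask is set, the bits of b elsewhere. */
    return (a & mask) | (b & ~mask);
}
```

Both `a` and `b` must already be computed before the call, so the cost of both "sides" is always paid.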

Therefore, it should in principle be possible to write branchless code in e.g. C/C++, and indeed gcc/clang will often emit branchless code with optimizations turned on (gcc even has a specific flag for this: -fif-conversion2). However, this appears to be an optimization decision: if the compiler thinks the branchless version will perform worse (say, if the "then" and "else" clauses each perform a lot of computation, more than the cost of flushing the pipeline after a mispredicted branch), then I assume the compiler will emit regular branching code.

If constant-time execution is a non-negotiable goal, one may be forced to use some of the aforementioned tricks to generate branchless code, making the code less clear. Also, performance is often a secondary but still important goal, so the developer has to hope that the compiler will infer the intended operation behind the branchless code and emit an efficient instruction sequence, often using the instructions mentioned above. This may require rewriting the code over and over while looking at the assembly output, until a magic incantation satisfies the compiler -- and the result may change from compiler to compiler, or when a new version comes out.

Overall, this is an awful situation on both sides: compiler writers must infer intent from obfuscated code, transforming it into a much simpler instruction sequence; while developers must write such obfuscated code, since there are no guarantees that simple, clear code would actually run in constant time.

Making this into a question: if a certain piece of code must be emitted as constant-time code (or not at all), is there a compiler flag or pragma that will force the code to be emitted as such, even if the compiler predicts worse performance than the branched version, or abort the compilation if that is not possible? Developers would be able to write clear code with the peace of mind that it will be constant-time, while supplying the compiler with code that is clear and easy to analyze. I understand this is probably a language- and compiler-dependent question, so I would be satisfied with either C or C++ answers, for either gcc or clang.

You probably need to write assembler code yourself. I doubt that any general purpose compiler would help you. — Phil1970, Jul 10 '21 at 14:55
Note that just making the code branchless is not enough -- cache effects may cause variations in runtime that leak information, and that may be impossible to avoid on some hardware. For real security, you need to carefully design the hardware as well. — Chris Dodd, Jul 10 '21 at 16:27
@ChrisDodd sure there are many hurdles, but they can be tackled one at a time. This question deals with the issue of branchless code. Moreover, for better or worse, much crypto code is executed on general purpose processors. Perfect is the enemy of good enough. — swineone, Jul 10 '21 at 21:45
If you are interested in running code securely on a general purpose processor, this is the wrong question. You should be asking "how can my secret info leak out of this system, and how do I prevent that?" For timing leaks, delaying the result until a fixed time is much better than trying to write branchless code, as it also avoids other timing issues that may occur. — Chris Dodd, Jul 10 '21 at 22:38
I'm unfamiliar with this technique. Can you cite any peer-reviewed papers espousing this technique? My understanding is that branchless constant-time code is the gold standard, at least in academia. Moreover, I'm fairly certain that your suggested technique would not be resistant to power analysis attacks when applicable (smart cards, embedded devices, etc.) — swineone, Jul 11 '21 at 04:11
I'm not sure, but I think performing both sides of a condition, then using a function (or an asm-containing macro that depends on both inputs) that's opaque to the compiler and selects a value could be an option (as said opaqueness would prevent the compiler from optimizing either branch away). It probably won't make for very readable code, though. Alternatively, one could possibly write a compiler plugin to automate the process. — Hasturkun, Jul 20 '21 at 11:29
Probably you would want a specialized compiler designed for this purpose, instead of trying to retrofit it onto an existing compiler that has spent 30+ years evolving toward the opposite goal (most efficient code). For that matter, C might be the wrong language to start with, since its semantics make this hard. (For example, code like `a = cond ? *p : 0;` can't be made branchless, because `p` is allowed to be an invalid pointer when `cond` is false, so the compiler has to make sure the load doesn't happen at all in that case.) — Nate Eldredge, Aug 01 '21 at 17:38
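
The "opaque to the compiler" idea from Hasturkun's comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: it relies on GNU-style extended inline asm (so gcc or clang), the names `value_barrier_u32` and `ct_select_opaque_u32` are hypothetical, and `cond` is assumed to be exactly 0 or 1. The empty asm statement hides the mask's value from the optimizer, which makes it harder for the compiler to recognize the select and replace it with a branch, but it is not a guarantee of constant-time code.

```c
#include <stdint.h>

/* Hypothetical helper: an empty asm statement that takes x as an in/out
 * register operand. The compiler must assume x may have been modified,
 * so it can no longer reason that the value is all-zeros or all-ones.
 * Assumption: GNU extended asm is available (gcc/clang). */
static inline uint32_t value_barrier_u32(uint32_t x)
{
    __asm__ volatile("" : "+r"(x));
    return x;
}

/* Hypothetical helper: returns a when cond == 1 and b when cond == 0.
 * Both a and b are computed by the caller before the selection. */
static uint32_t ct_select_opaque_u32(uint32_t cond, uint32_t a, uint32_t b)
{
    uint32_t mask = value_barrier_u32((uint32_t)0 - cond);
    return (a & mask) | (b & ~mask);
}
```

Checking the generated assembly is still necessary, since the barrier only limits what the optimizer knows about `mask` and says nothing about the code that computes `a` and `b`.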

flying_duck · Answer 1 · 2022-10-24T18:09:12.947

I found this question by going down a similar rabbit hole. For security purposes I require my code to not branch on secret data and to not leak information through timing attacks.

While not an answer per se, I can recommend this paper from S&P 2018: https://ieeexplore.ieee.org/document/8406587. The authors also wrote an extension for Clang/LLVM. I am not sure how well this extension works, but it's a first step and gives a good overview of where we currently stand in the research context.
