
I was browsing the Rust standard library. It caught my eye that when a closure is passed to a function as a parameter, it is passed at run time. For example, from the Iterator trait:

fn filter<'r>(self, predicate: 'r |&A| -> bool) -> Filter<'r, A, Self>

'predicate' here is not a generic parameter but a normal, run-time parameter. Does that not imply that the compiler cannot inline the call to 'predicate'? Is this a deliberate design choice (for example, to avoid code bloat), or does the Rust language not provide a way to pass closures at compile time?

libeako

1 Answer


You are correct that closures are currently effectively trait objects; that is, they store a function pointer to their actual code (similar to std::function in C++).
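
To make that concrete, here is a rough sketch in today's Rust syntax (not the 0.x syntax quoted above) of what a dynamically dispatched closure call looks like; the function count_matching and the example data are invented for illustration only:

// A minimal sketch (modern Rust syntax) of a dynamically dispatched closure:
// the callee only sees a trait object, so each call goes through a function
// pointer rather than a direct, easily inlinable call.
fn count_matching(xs: &[i32], pred: &mut dyn FnMut(&i32) -> bool) -> usize {
    let mut n = 0;
    for x in xs {
        // `pred` is called through its vtable here.
        if pred(x) {
            n += 1;
        }
    }
    n
}

fn main() {
    let nums = [1, 2, 3, 4, 5, 6];
    let mut is_multiple_of_three = |x: &i32| x % 3 == 0;
    // The concrete closure type is erased behind `&mut dyn FnMut`.
    println!("{}", count_matching(&nums, &mut is_multiple_of_three)); // prints 2
}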

This is known to be insufficient and is not the final design. There is current work happening on "unboxed closures", which fixes this by making Rust's closures like C++11's: each closure gets a unique type that implements the appropriate methods to make it callable. (The current proposal for Rust consists of three traits for full flexibility (Fn, FnMut and FnOnce); see the unboxed closures RFC for more details.)
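
As a rough illustration of how those three traits are used as bounds (written in modern Rust syntax, not the syntax used elsewhere in this answer; the helper functions are invented for this example):

// Fn: callable any number of times through a shared reference.
fn call_twice<F: Fn() -> i32>(f: F) -> i32 {
    f() + f()
}

// FnMut: may mutate captured state, so each call takes `&mut self`.
fn call_twice_mut<F: FnMut() -> i32>(mut f: F) -> i32 {
    f() + f()
}

// FnOnce: consumes its captures, so it can be called at most once.
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn main() {
    let base = 10;
    println!("{}", call_twice(|| base + 1)); // 22

    let mut counter = 0;
    println!("{}", call_twice_mut(|| { counter += 1; counter })); // 3

    let s = String::from("hello");
    println!("{}", call_once(move || s)); // hello
}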

After that, filter might look like

fn filter<F: FnMut<(&A,), bool>>(self, predicate: F) -> Filter<A, Self, F>

(There may be sugar so that the bound can be written F: |&A| -> bool, or even more sugar that allows writing something like predicate: impl |&A| -> bool directly, without the extra type parameter (although that won't work for .filter specifically, since it needs to name the type parameter in its return type).)
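
For concreteness, here is a hypothetical, stripped-down version of such a statically dispatched filter adaptor, written in modern syntax with invented names (MyFilter and so on); it shows why the closure's unique type F has to surface in the adaptor's (and hence .filter's) return type:

// The closure's unique type F is part of the adaptor's type, so every call
// to the predicate is a direct, statically dispatched call.
struct MyFilter<I, F> {
    iter: I,
    pred: F,
}

impl<I, F> Iterator for MyFilter<I, F>
where
    I: Iterator,
    F: FnMut(&I::Item) -> bool,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        while let Some(item) = self.iter.next() {
            // Direct call: trivially inlinable by the optimiser.
            if (self.pred)(&item) {
                return Some(item);
            }
        }
        None
    }
}

fn main() {
    let filtered = MyFilter { iter: 0..10, pred: |x: &i32| x % 3 == 0 };
    let v: Vec<i32> = filtered.collect();
    println!("{:?}", v); // [0, 3, 6, 9]
}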

Under this scheme, it will still be possible to have an erased function type (e.g. to avoid code bloat, or to store many different closures in some data structure), via exactly the same mechanism by which trait objects work, writing something like predicate: &mut FnMut<(&A,), bool>, but these will not be used in iterator adaptors.
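
A sketch of that erased form (again in modern syntax; the use of Box to own the closures, and the particular predicates and threshold, are invented for illustration):

// Many closures with different concrete types stored behind one trait-object
// type, at the cost of a dynamically dispatched call for each use.
fn main() {
    let threshold = 10;
    let predicates: Vec<Box<dyn Fn(i32) -> bool>> = vec![
        Box::new(|x| x % 2 == 0),
        Box::new(|x| x < 0),
        Box::new(move |x| x > threshold),
    ];

    for (i, pred) in predicates.iter().enumerate() {
        // Each call here goes through the trait object's vtable.
        println!("predicate {} on 12: {}", i, pred(12));
    }
}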


Also, it is possible for LLVM to inline the dynamic closures even now, but it is definitely not as easy or as guaranteed as it is with statically dispatched unboxed closures. For example,

fn main() {
    for _ in range(0, 100).filter(|&x| x % 3 == 0) {
        std::io::println("tick") // stop the loop being optimised away
    }
}

compiles to the following optimised LLVM IR (via rustc --emit=ir -O):

; Function Attrs: uwtable
define internal void @_ZN4main20h9f09eab975334327eaa4v0.0E() unnamed_addr #0 {
entry-block:
  %0 = alloca %str_slice, align 8
  %1 = getelementptr inbounds %str_slice* %0, i64 0, i32 0
  %2 = getelementptr inbounds %str_slice* %0, i64 0, i32 1
  br label %match_else.i

match_else.i:                                     ; preds = %loop_body.i.backedge, %entry-block
  %.sroa.012.0.load1624 = phi i64 [ 0, %entry-block ], [ %3, %loop_body.i.backedge ]
  %3 = add i64 %.sroa.012.0.load1624, 1
  %4 = srem i64 %.sroa.012.0.load1624, 3
  %5 = icmp eq i64 %4, 0
  br i1 %5, label %match_else, label %loop_body.i.backedge

match_else:                                       ; preds = %match_else.i
  store i8* getelementptr inbounds ([4 x i8]* @str1233, i64 0, i64 0), i8** %1, align 8
  store i64 4, i64* %2, align 8
  call void @_ZN2io5stdio7println20h44016c4e880db7991uk11v0.11.0.preE(%str_slice* noalias nocapture nonnull %0)
  br label %loop_body.i.backedge

loop_body.i.backedge:                             ; preds = %match_else, %match_else.i
  %exitcond = icmp eq i64 %3, 100
  br i1 %exitcond, label %join4, label %match_else.i

join4:                                            ; preds = %loop_body.i.backedge
  ret void
}

In particular, the filter call is completely inlined: it all happens in the match_else.i block, where the %4 = srem i64 ..., 3 instruction is the % 3 piece of the predicate and the icmp eq i64 %4, 0 is the == 0 bit.

huon