You are correct that closures are currently effectively trait objects: they store a function pointer to their actual code (similar to std::function in C++).
This is known to be insufficient and is not the final design. There is ongoing work on "unboxed closures", which fix this by making Rust's closures like C++11's, where each closure gets a unique type that implements the appropriate methods to make it callable. (The current proposal for Rust consists of 3 traits for full flexibility (Fn, FnMut and FnOnce); see the RFC link above for more details.)
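As a rough illustration of the three traits, here is a minimal sketch written in the syntax Rust eventually stabilised (F: Fn(i32) -> i32 rather than the angle-bracket form from the proposal). The helper names call_twice, call_three_times and call_once are made up for this example; they are not part of any library.

```rust
// Each closure passed below has its own anonymous type. The generic
// parameter F is resolved at compile time, so the calls are statically
// dispatched.

// Fn: callable through a shared reference, any number of times.
fn call_twice<F: Fn(i32) -> i32>(f: F) -> i32 {
    f(1) + f(2)
}

// FnMut: may mutate captured state, so it needs a mutable binding.
fn call_three_times<F: FnMut()>(mut f: F) {
    f();
    f();
    f();
}

// FnOnce: may consume captured values, so it can be called only once.
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn main() {
    let offset = 10;
    // Captures `offset` by shared reference.
    println!("{}", call_twice(|x| x + offset));

    let mut count = 0;
    // Captures `count` by mutable reference.
    call_three_times(|| count += 1);
    println!("{}", count);

    let owned = String::from("moved");
    // Takes ownership of `owned` and hands it back out.
    println!("{}", call_once(move || owned));
}
```

Every Fn closure is also usable as FnMut and FnOnce, so the least restrictive bound that works for the callee is usually the one to pick.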
After that, filter might look like
fn filter<'r, F: FnMut<(&A,), bool>>(self, predicate: F) -> Filter<A, Self, F>
(There may be sugar, so the bound can be written F: |&A| -> bool, or even more sugar making it possible to write something like predicate: impl |&A| -> bool directly, without the additional type parameter (although this won't work with .filter specifically, since it needs to pass the type parameter into the return type).)
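To make the point about the return type concrete, here is a hedged sketch of a filter-like adaptor whose closure type is known statically. MyFilter and my_filter are hypothetical stand-ins, not the standard library's definitions, and the bound uses the FnMut(&T) -> bool sugar from later Rust releases.

```rust
// Hypothetical stand-in for the standard library's Filter adaptor.
struct MyFilter<I, F> {
    iter: I,
    predicate: F,
}

impl<I, F> Iterator for MyFilter<I, F>
where
    I: Iterator,
    F: FnMut(&I::Item) -> bool,
{
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        // Pull items from the inner iterator until one passes the predicate.
        while let Some(item) = self.iter.next() {
            if (self.predicate)(&item) {
                return Some(item);
            }
        }
        None
    }
}

// The closure type F appears in the return type, which is why the
// adaptor needs a named type parameter for it.
fn my_filter<I, F>(iter: I, predicate: F) -> MyFilter<I, F>
where
    I: Iterator,
    F: FnMut(&I::Item) -> bool,
{
    MyFilter { iter, predicate }
}

fn main() {
    let multiples: Vec<u32> = my_filter(0u32..10, |&x| x % 3 == 0).collect();
    println!("{:?}", multiples);
}
```

Because F is a concrete type inside MyFilter, the call to the predicate in next is a direct call that the optimiser can inline.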
Under this scheme, it will still be possible to have an erased function type (e.g. to stop code bloat, or to store many different closures in some data structure), via exactly the same mechanism as trait objects, writing something like predicate: &mut FnMut<(&A,), bool>, but these will not be used in iterator adaptors.
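For comparison, here is a small sketch of the erased form, assuming the later dyn syntax for trait objects (&mut dyn FnMut(&i32) -> bool). count_matching and make_checks are hypothetical names used only for illustration.

```rust
// Takes an erased closure: one compiled copy of this function serves
// every closure, at the cost of an indirect call through a vtable.
fn count_matching(items: &[i32], predicate: &mut dyn FnMut(&i32) -> bool) -> usize {
    let mut n = 0;
    for item in items {
        if predicate(item) {
            n += 1;
        }
    }
    n
}

// Closures of different types stored side by side behind Box<dyn Fn>.
fn make_checks() -> Vec<Box<dyn Fn(&i32) -> bool>> {
    let limit = 5;
    let mut checks: Vec<Box<dyn Fn(&i32) -> bool>> = Vec::new();
    checks.push(Box::new(|x: &i32| x % 3 == 0));
    checks.push(Box::new(move |x: &i32| *x > limit));
    checks
}

fn main() {
    let items = [1, 3, 6, 7, 9];

    let mut seen = 0;
    let matches = count_matching(&items, &mut |x: &i32| {
        seen += 1; // the erased closure can still mutate captured state
        x % 3 == 0
    });
    println!("{} of {} items matched", matches, seen);

    for check in make_checks().iter() {
        println!("{}", items.iter().filter(|x| check(x)).count());
    }
}
```

The trade-off is the one described above: a single copy of count_matching instead of one per closure type, in exchange for a call the optimiser may or may not be able to see through.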
Also, it is possible for LLVM to inline the dynamic closures now, but it is definitely not as easy or as guaranteed as it is with the statically-dispatched unboxed closures, e.g.
fn main() {
    for _ in range(0, 100).filter(|&x| x % 3 == 0) {
        std::io::println("tick") // stop the loop being optimised away
    }
}
compiles to the following optimised LLVM IR (via rustc --emit=ir -O):
; Function Attrs: uwtable
define internal void @_ZN4main20h9f09eab975334327eaa4v0.0E() unnamed_addr #0 {
entry-block:
  %0 = alloca %str_slice, align 8
  %1 = getelementptr inbounds %str_slice* %0, i64 0, i32 0
  %2 = getelementptr inbounds %str_slice* %0, i64 0, i32 1
  br label %match_else.i

match_else.i:                                     ; preds = %loop_body.i.backedge, %entry-block
  %.sroa.012.0.load1624 = phi i64 [ 0, %entry-block ], [ %3, %loop_body.i.backedge ]
  %3 = add i64 %.sroa.012.0.load1624, 1
  %4 = srem i64 %.sroa.012.0.load1624, 3
  %5 = icmp eq i64 %4, 0
  br i1 %5, label %match_else, label %loop_body.i.backedge

match_else:                                       ; preds = %match_else.i
  store i8* getelementptr inbounds ([4 x i8]* @str1233, i64 0, i64 0), i8** %1, align 8
  store i64 4, i64* %2, align 8
  call void @_ZN2io5stdio7println20h44016c4e880db7991uk11v0.11.0.preE(%str_slice* noalias nocapture nonnull %0)
  br label %loop_body.i.backedge

loop_body.i.backedge:                             ; preds = %match_else, %match_else.i
  %exitcond = icmp eq i64 %3, 100
  br i1 %exitcond, label %join4, label %match_else.i

join4:                                            ; preds = %loop_body.i.backedge
  ret void
}
In particular, the filter call is completely inlined: it's all in the match_else.i: block. You can see that the %4 = srem i64 ..., 3 instruction is the % 3 piece of code, and the icmp eq i64 %4, 0 is the == 0 bit.