
When I define an async function in main.rs, the compiler generates a poll function and a context initialization function for it, like this:

pub async fn hello_world()
{
    println!("hello world!!!");
}

#[tokio::main]
async fn main() {
    hello_world().await;
}

From the symbol table, it can be seen that the async hello_world function generates two functions: the poll function of the Future trait and a context initialization function. Running readelf -sW hello_world shows their names in the symbol table: _ZN15test_helloworld11hello_world28_$u7b$$u7b$closure$u7d$$u7d$17hfb37a71ab872a722E and _ZN15test_helloworld11hello_world17h68b666166140c723E.
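To make the two symbol names easier to read, here is a minimal sketch of how the legacy `_ZN...E` mangling can be split into path segments. The function names `demangle_legacy` and `unescape` are hypothetical, and only the two escapes that occur in the symbols above (`$u7b$` and `$u7d$`) are handled; a real demangler covers many more cases.

```rust
/// Splits a legacy-mangled Rust symbol (`_ZN<len><segment>...E`) into its
/// path segments. Returns `None` if the symbol does not have that shape.
pub fn demangle_legacy(sym: &str) -> Option<Vec<String>> {
    let mut rest = sym.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut parts = Vec::new();
    while !rest.is_empty() {
        // Each segment is prefixed by its length in decimal.
        let digits = rest.chars().take_while(|c| c.is_ascii_digit()).count();
        if digits == 0 {
            return None;
        }
        let len: usize = rest[..digits].parse().ok()?;
        let segment = rest.get(digits..digits + len)?;
        parts.push(unescape(segment));
        rest = &rest[digits + len..];
    }
    Some(parts)
}

/// Undoes the two escapes seen in the question: `$u7b$` is `{` and `$u7d$` is `}`.
/// A segment that starts with an escape carries an extra leading underscore.
fn unescape(segment: &str) -> String {
    segment
        .strip_prefix('_')
        .filter(|s| s.starts_with('$'))
        .unwrap_or(segment)
        .replace("$u7b$", "{")
        .replace("$u7d$", "}")
}
```

Applied to the first symbol, this yields the segments test_helloworld, hello_world, {{closure}} and a trailing hash, which matches the {{closure}} labels that appear in the assembly listings further down.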

But when I defined async functions in lib.rs, there was no poll function, like this:

pub async fn hello_world()
{
    println!("hello world!!!");
}

pub async fn start_world()
{
    hello_world().await;
}

After compiling the Rust library into an object file with the command cargo rustc -- --emit=obj, the symbol table shows that the hello_world function has only a context initialization function and no Future poll function. Running readelf -sW xxx.o shows only _ZN14helloworld_lib11hello_world17h3e453d3cd706914eE in the symbol table; the poll function is missing from the object file.

This confuses me. In Rust, async is syntactic sugar that produces a poll function once it is desugared, so why did the compiler not generate it? When is this poll function generated? Can I add compilation options to make the Rust compiler generate it?

super_jh
    It's likely to be an optimization, some sort of "async inlining". Although it would be weird for a crate-pub function to be missing symbols. – Filipe Rodrigues Aug 08 '23 at 18:42

1 Answer


Async functions return types that implement the Future trait. Machine code for trait implementations may not be generated until it is needed, by some use of the trait implementation; in this case, the compiler compiling the use-site gets the implementation code as Rust MIR (mid-level intermediate representation), not machine code. (I don't know why the compiler makes this choice; it may be because generic code is very often a candidate for inlining, so emitting machine code functions that would be rarely called is not worthwhile.)
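To make "a type that implements the Future trait" concrete, here is a rough hand-written analogue of what hello_world() might desugar to. This is only a sketch: the names `HelloWorldFuture`, `hello_world_manual`, `NoopWake` and `poll_once` are invented for illustration, and the real compiler-generated type is anonymous and has more states than this one.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-written stand-in for the anonymous type returned by `hello_world()`.
pub struct HelloWorldFuture {
    finished: bool,
}

/// Plays the role of the "context initialization" function:
/// it only builds the initial state and runs none of the body.
pub fn hello_world_manual() -> HelloWorldFuture {
    HelloWorldFuture { finished: false }
}

impl Future for HelloWorldFuture {
    type Output = ();

    /// Plays the role of the poll function: the body of the async fn lives here.
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        assert!(!self.finished, "polled after completion");
        println!("hello world!!!");
        self.finished = true;
        Poll::Ready(())
    }
}

/// A waker that does nothing, enough to poll a future that never suspends.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls any future exactly once with the no-op waker.
pub fn poll_once<F: Future + ?Sized>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    fut.poll(&mut cx)
}
```

Calling hello_world_manual() prints nothing; the message only appears when poll_once is called on the pinned value, which mirrors the split between the two symbols in the question.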

We can force an implementation to be included by writing a plain function that explicitly returns a dyn Future. Vtables for dyn dispatch are generated when, and only when, the Rust code coerces a concrete type (such as the return type of start_world()) to a dyn type (such as dyn Future), so writing the following function will force a vtable, and the functions in it, to be generated, because returning the Box performs such a coercion:

use std::future::Future;
use std::pin::Pin;

pub fn concrete_start() -> Pin<Box<dyn Future<Output = ()>>> {
    Box::pin(start_world())
}

(Pinning isn't necessary for this demonstration, but would be present in any practical function returning this type. It doesn't change the machine code.)
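As a usage sketch, the following self-contained lib.rs-style snippet shows a caller driving the boxed future purely through dyn dispatch, which is the path that needs the vtable entry for poll. The helper `drive_once` and the `NoopWake` type are assumptions added for illustration; the other three functions are the ones from the question and this answer.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

pub async fn hello_world() {
    println!("hello world!!!");
}

pub async fn start_world() {
    hello_world().await;
}

/// Coerces the concrete future type to `dyn Future`, as in the answer.
pub fn concrete_start() -> Pin<Box<dyn Future<Output = ()>>> {
    Box::pin(start_world())
}

/// A waker that does nothing; sufficient because these futures never suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical helper: polls the type-erased future once.
/// The call to `poll` goes through the vtable of `dyn Future`.
pub fn drive_once() -> Poll<()> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = concrete_start();
    fut.as_mut().poll(&mut cx)
}
```

Because neither async function ever waits on anything, a single poll is expected to run the println!() and complete.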

Now let's look at the generated code with cargo-show-asm. You don't need to know very much about the assembly language for your processor (in this example, x86) to be able to tell what is present, because you can just look for the presence of interesting labels/symbols. concrete_start() appears:

scratchpad::concrete_start:
Lfunc_begin4:
    push rbp
    mov rbp, rsp
    mov rax, qword ptr [rip + ___rust_no_alloc_shim_is_unstable@GOTPCREL]
    movzx eax, byte ptr [rax]
    mov edi, 2
    mov esi, 1
    call ___rust_alloc
    test rax, rax
    je LBB4_2
    mov word ptr [rax], 0
    lea rdx, [rip + l___unnamed_1]
    pop rbp
    ret
LBB4_2:
    mov edi, 1
    mov esi, 2
    call alloc::alloc::handle_alloc_error

Of course, most of this is the Box allocation. But l___unnamed_1 is the vtable for the Future implementation:

l___unnamed_1:
    .quad   core::ptr::drop_in_place<scratchpad::start_world::{{closure}}>
    .asciz  "\002\000\000\000\000\000\000\000\001\000\000\000\000\000\000"
    .quad   scratchpad::start_world::{{closure}}

This vtable consists of a pointer to the type's drop glue (a function that knows how to drop/destruct/deallocate it), its size and alignment, and a pointer to the Future::poll() implementation:
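The size and alignment entries can also be observed from Rust code. The sketch below (function names `layout_static` and `layout_via_vtable` are made up for this example) reads the layout once from the concrete type and once through a dyn Future reference, where the values have to come from the vtable. The actual numbers depend on the compiler version, so none are hard-coded here.

```rust
use std::future::Future;
use std::mem::{align_of_val, size_of_val};
use std::pin::Pin;

pub async fn hello_world() {
    println!("hello world!!!");
}

pub async fn start_world() {
    hello_world().await;
}

/// Size and alignment taken from the statically known future type.
pub fn layout_static() -> (usize, usize) {
    let fut = start_world();
    (size_of_val(&fut), align_of_val(&fut))
}

/// Size and alignment taken from a type-erased reference; with the concrete
/// type hidden, these values are looked up in the vtable at run time.
pub fn layout_via_vtable() -> (usize, usize) {
    let boxed: Pin<Box<dyn Future<Output = ()>>> = Box::pin(start_world());
    let erased: &dyn Future<Output = ()> = &*boxed;
    (size_of_val(erased), align_of_val(erased))
}
```

Both functions should report the same pair, which in the listing above is encoded in the .asciz line between the two .quad entries.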

scratchpad::start_world::{{closure}}:
Lfunc_begin3:
    push rbp
    mov rbp, rsp
    push rbx
    sub rsp, 56
    mov rbx, rdi
    movzx eax, byte ptr [rdi]
    lea rcx, [rip + LJTI3_0]
    movsxd rax, dword ptr [rcx + 4*rax]
    add rax, rcx
    jmp rax

    mov byte ptr [rbx + 1], 0
    jmp LBB3_7

    movzx eax, byte ptr [rbx + 1]
    test eax, eax
    jne LBB3_4
LBB3_7:
    lea rax, [rip + l___unnamed_2]
    mov qword ptr [rbp - 56], rax
    mov qword ptr [rbp - 48], 1
    mov qword ptr [rbp - 24], 0
    lea rax, [rip + l___unnamed_3]
    mov qword ptr [rbp - 40], rax
    mov qword ptr [rbp - 32], 0
    lea rdi, [rbp - 56]
    call std::io::stdio::_print
    mov word ptr [rbx], 257
    xor eax, eax
    add rsp, 56
    pop rbx
    pop rbp
    ret
LBB3_4:
    cmp eax, 1
    jne LBB3_10
    mov esi, 35
    lea rdi, [rip + _str.0]
    jmp LBB3_11

    lea rdi, [rip + _str.1]
    lea rdx, [rip + l___unnamed_4]
    mov esi, 34
    call core::panicking::panic

    lea rdi, [rip + _str.0]
    lea rdx, [rip + l___unnamed_4]
    mov esi, 35
    call core::panicking::panic
LBB3_10:
    mov esi, 34
    lea rdi, [rip + _str.1]
LBB3_11:
    lea rdx, [rip + l___unnamed_5]
    call core::panicking::panic
    ud2

    mov byte ptr [rbx + 1], 2
    jmp LBB3_14

LBB3_14:
    mov byte ptr [rbx], 2
    mov rdi, rax
    call __Unwind_Resume

And there's your println!() call, as well as a bunch of checks for invalid state transitions, like the Future being polled after it completes. Here are the message strings for those panics:

.section __TEXT,__const
    .p2align    4, 0x0
_str.0:
    .ascii  "`async fn` resumed after completion"

    .p2align    4, 0x0
_str.1:
    .ascii  "`async fn` resumed after panicking"
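The first of these panics can be triggered on purpose by polling a completed future again. The following is a small sketch assuming the hello_world function from the question; `second_poll_panics` and `NoopWake` are hypothetical names introduced here.

```rust
use std::future::Future;
use std::panic::catch_unwind;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

pub async fn hello_world() {
    println!("hello world!!!");
}

/// A waker that does nothing; sufficient because the future never suspends.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `hello_world()` twice and reports whether the second poll panicked.
pub fn second_poll_panics() -> bool {
    let outcome = catch_unwind(|| {
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        let mut fut = Box::pin(hello_world());
        // First poll runs the body and completes.
        let _ = fut.as_mut().poll(&mut cx);
        // Second poll hits the "resumed after completion" check.
        let _ = fut.as_mut().poll(&mut cx);
    });
    outcome.is_err()
}
```

The panic message itself is written to standard error by the default panic hook; the function only reports that a panic occurred.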
Kevin Reid