
I am trying to embed a version number into a library. Ideally, this should be a static C string that can be read and doesn't need any additional allocation for reading the version number.

On the Rust side, I am using vergen to generate the versioning information like this:

pub static VERSION: &str = env!("VERGEN_SEMVER");

and I would like to end up with something like

#[no_mangle]
pub static VERSION_C: *const u8 = ... ;

There seems to be a way to achieve this using string literals, but I haven't found a way to do this with compile time strings. Creating a new CString seems to be beyond the current capabilities of static variables and tends to end with an error E0015.

A function returning the pointer like this would be acceptable, as long as it does not allocate new memory.

#[no_mangle]
pub extern "C" fn get_version() -> *const u8 {
    // ...
}

The final type of the variable (or return type of the function) doesn't have to be based on u8, but should be translatable through cbindgen. If some other FFI type is more appropriate, using that is perfectly fine.

Shepmaster
Michael Mauderer

2 Answers


By ensuring that the static string slice is compatible with a C-style string (as in, it ends with the null terminator byte \0), we can safely fetch a pointer to the beginning of the slice and pass that across the boundary.

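use std::os::raw::c_char;

// The trailing "\0" makes the slice usable as a C string.
pub static VERSION: &str = concat!(env!("VERGEN_SEMVER"), "\0");

#[no_mangle]
pub extern "C" fn get_version() -> *const c_char {
    // Points into static memory; nothing is allocated.
    VERSION.as_ptr() as *const c_char
}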

Here's an example in the Playground, where I used the package's version as the environment variable to fetch and called the function in Rust.
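To check the approach above without vergen, here is a minimal sketch that reads the pointer back the way a C caller would. The literal `"1.2.3"` is a stand-in for `env!("VERGEN_SEMVER")`, and `version_from_c` is a hypothetical helper that exists only for this check.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Assumption: "1.2.3" stands in for `env!("VERGEN_SEMVER")` so that the
// sketch builds without vergen. The trailing "\0" is added at compile time.
pub static VERSION: &str = concat!("1.2.3", "\0");

// In a real library, add `#[no_mangle]` (written `#[unsafe(no_mangle)]`
// on edition 2024) so the symbol is exported under this exact name.
pub extern "C" fn get_version() -> *const c_char {
    // Pointer into static memory; no allocation happens here.
    VERSION.as_ptr() as *const c_char
}

/// Hypothetical helper: reads the string back through the raw pointer,
/// as a C caller would, and returns it without the terminator.
pub fn version_from_c() -> &'static str {
    let ptr = get_version();
    // SAFETY: `ptr` points at `VERSION`, which is NUL-terminated and
    // lives for the whole program.
    let c_str: &'static CStr = unsafe { CStr::from_ptr(ptr) };
    c_str.to_str().expect("version should be valid UTF-8")
}
```

Note that this is only sound if the version string has no interior NUL byte; `CStr::from_ptr` would otherwise stop at the first one.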

Shepmaster
E_net4

Now, with constant evaluation, you can wrap this up a little more nicely.

use std::ffi::CStr;

// SAFETY: the input must not have interior NUL bytes
macro_rules! static_cstr {
    ($l:expr) => {
        ::std::ffi::CStr::from_bytes_with_nul_unchecked(
            concat!($l, "\0").as_bytes()
        )
    }
}

pub static VERSION: &CStr = unsafe { static_cstr!(env!("VERGEN_SEMVER")) };

EDIT:

Inspired by the discussion in the comments, we can make this safe by adding our own compile-time checks for input validity (there is an issue to stabilize const versions of the CStr methods, at which point we won't need to do this ourselves):

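use std::ffi::CStr;

const fn from_bytes_with_nul(b: &[u8]) -> &CStr {
    // An empty slice has no terminator; reject it before `b.len() - 1` can underflow.
    if b.is_empty() {
        panic!("no nul-terminator");
    }

    let mut i = 0;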
    while i < b.len() - 1 {
        if b[i] == b'\0' {
            panic!("interior nul byte");
        }
        i += 1;
    }

    if b[b.len() - 1] != b'\0' {
        panic!("no nul-terminator");
    }

    // SAFETY: we verify above that `b` is nul-terminated
    // and has no interior nul bytes
    unsafe { CStr::from_bytes_with_nul_unchecked(b) }
}

macro_rules! static_cstr {
    ($l:expr) => { from_bytes_with_nul(concat!($l, "\0").as_bytes()) }
}
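On toolchains where the standard library's `CStr::from_bytes_with_nul` is itself a `const fn` (to my knowledge, Rust 1.72 and later), the hand-written check can probably be dropped. The following is a sketch under that assumption, with the literal `"1.2.3"` standing in for `env!("VERGEN_SEMVER")`. It also shows the pointer-returning function asked for in the question.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Appends the terminator and validates the bytes. When the macro is used
// in a `static` or `const` initializer, an interior NUL byte becomes a
// compile error instead of a runtime panic.
macro_rules! static_cstr {
    ($l:expr) => {
        match ::std::ffi::CStr::from_bytes_with_nul(concat!($l, "\0").as_bytes()) {
            Ok(s) => s,
            Err(_) => panic!("interior nul byte"),
        }
    };
}

// Assumption: "1.2.3" stands in for `env!("VERGEN_SEMVER")`.
pub static VERSION: &CStr = static_cstr!("1.2.3");

// In a real library, add `#[no_mangle]` (written `#[unsafe(no_mangle)]`
// on edition 2024) so the symbol is exported under this exact name.
pub extern "C" fn get_version() -> *const c_char {
    // `&CStr` is a fat pointer; `as_ptr` yields the thin pointer C expects.
    VERSION.as_ptr()
}
```

Keeping the static as `&CStr` records in the type that the string is valid, while the function hands C a plain `*const c_char`.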

willtunnels
  • Note that this is technically unsound, due to `$l` possibly containing inner nulls. (In practice, this is **currently** checked when calling in a const context, but this isn't guaranteed by the function, and your macro can be used in non-const contexts). See [this answer's replies](https://stackoverflow.com/a/75064432/6655004), which use a similar macro, for more info. – Filipe Rodrigues Mar 24 '23 at 14:12
  • Also, you have a `&CStr`, when a `*const u8` or `fn() -> *const u8` is requested. These are different types since `&CStr` is a fat pointer, which keeps track of the length – Filipe Rodrigues Mar 24 '23 at 14:15
  • I have edited my response accordingly. I am surprised that it is not safe for the input to have interior NUL bytes, as I would expect the resulting `CStr` to simply be the corresponding truncation. As to your latter point, the intention is that the user calls `.as_ptr()` as in Shepmaster's answer. This way, however, it is encoded in the type system that their string is a valid `CStr`. Seen another way, the assumption that it is a valid `CStr` is explicit and located where the string is declared. – willtunnels Mar 25 '23 at 22:40
  • That's fair, OP could always just define a `fn get_version() -> *const u8` that simply calls `.as_ptr()` on `VERSION`. – Filipe Rodrigues Mar 25 '23 at 23:54