
I am confused about why the size of Vec<i64> and VecEnum is the same in the following code:

pub enum VecEnum {
    Abc,
    Vec(Vec<i64>),
}

pub enum IntEnum {
    Abc,
    Int(i64),
}

pub fn main() {
    println!("IntEnum: {} bytes", core::mem::size_of::<IntEnum>());
    println!("i64: {} bytes", core::mem::size_of::<i64>());
    println!("VecEnum: {} bytes", core::mem::size_of::<VecEnum>());
    println!("Vec<i64>: {} bytes", core::mem::size_of::<Vec<i64>>());
}

This outputs the following:

IntEnum: 16 bytes 
i64: 8 bytes      
VecEnum: 24 bytes 
Vec<i64>: 24 bytes

For the i64 it behaves as expected: an enum with an i64 variant requires extra space to encode the enum tag. But why is this not the case for the Vec, which just consists of three 8-byte values (ptr, len, capacity) of stack memory?

Can someone explain how the memory layout works here and what is happening under the hood?

Tadeo Hepperle

1 Answer


What you have here is a so-called Option-like enum:

An option-like enum is a 2-variant enum where:

  • the enum has no explicit #[repr(...)], and
  • one variant has a single field, and
  • the other variant has no fields (the "unit variant").

(Basically, there's not a significant difference between VecEnum and Option<Vec<i64>>.)
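As a quick check of that equivalence, here is a minimal sketch that prints the size of VecEnum next to Option<Vec<i64>> and Vec<i64>. The exact numbers depend on the target and on current rustc layout behavior, so treat them as observations rather than guarantees.

```rust
use std::mem::size_of;

// Same shape as the enum in the question: one unit variant, one single-field variant.
#[allow(dead_code)]
pub enum VecEnum {
    Abc,
    Vec(Vec<i64>),
}

fn main() {
    println!("VecEnum:          {} bytes", size_of::<VecEnum>());
    println!("Option<Vec<i64>>: {} bytes", size_of::<Option<Vec<i64>>>());
    println!("Vec<i64>:         {} bytes", size_of::<Vec<i64>>());
}
```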

Reading further in the source quoted above, we can see that the compiler can use "niches" (bit patterns that are illegal values of the payload type, here Vec<i64>) to represent the payload-less variant. This is called "discriminant elision."

The most obvious and well-known example of this is Option<&T>. Since references cannot be null, the zero value is a niche that can be used to store the None variant. This makes &T and Option<&T> have the same size.
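A small sketch of the pointer case, comparing types that can never be null with a raw pointer, which can. It only uses std::mem::size_of and std::ptr::NonNull; the choice of i64 as the pointee is arbitrary.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

fn main() {
    // &T, Box<T> and NonNull<T> are never null, so None can reuse the null bit pattern.
    println!("&i64: {}, Option<&i64>: {}", size_of::<&i64>(), size_of::<Option<&i64>>());
    println!(
        "Box<i64>: {}, Option<Box<i64>>: {}",
        size_of::<Box<i64>>(),
        size_of::<Option<Box<i64>>>()
    );
    println!(
        "NonNull<i64>: {}, Option<NonNull<i64>>: {}",
        size_of::<NonNull<i64>>(),
        size_of::<Option<NonNull<i64>>>()
    );

    // A raw pointer may legitimately be null, so there is no spare bit pattern
    // and the Option needs room for a separate tag.
    println!(
        "*const i64: {}, Option<*const i64>: {}",
        size_of::<*const i64>(),
        size_of::<Option<*const i64>>()
    );
}
```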

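The same thing is happening here. The first field of Vec<T> is RawVec<T> (an internal type), whose first field is a (doc-hidden) type called Unique<T>. These are private implementation details of the standard library, so the exact nesting may differ between Rust versions, but the non-null pointer at the bottom is what matters: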

A wrapper around a raw non-null *mut T ...

Well, if the compiler knows that Unique<T> can't be null, then a null pointer in this field is a niche of the Vec<T> type and so this can be used as a substitute discriminant, and discriminant elision is performed.
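To see that the non-null pointer field is what makes the difference, here is a sketch with two hypothetical three-word structs. NonNullParts and RawParts are made-up names for illustration and are not the real definition of Vec<T>; they only mimic its (ptr, len, capacity) shape.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

// Hypothetical stand-in for a Vec-like layout with a pointer that is never null.
#[allow(dead_code)]
struct NonNullParts {
    ptr: NonNull<i64>,
    len: usize,
    cap: usize,
}

// Same shape, but the raw pointer is allowed to be null, so it offers no niche.
#[allow(dead_code)]
struct RawParts {
    ptr: *mut i64,
    len: usize,
    cap: usize,
}

fn main() {
    println!(
        "NonNullParts: {}, Option<NonNullParts>: {}",
        size_of::<NonNullParts>(),
        size_of::<Option<NonNullParts>>()
    );
    println!(
        "RawParts: {}, Option<RawParts>: {}",
        size_of::<RawParts>(),
        size_of::<Option<RawParts>>()
    );
}
```

With current rustc, the first pair should report equal sizes, while the Option around RawParts grows because it needs a separate tag.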

Note in particular that "all zeroes" is not the only valid niche. Any bit-pattern that would be an illegal value for the payload type can be used. For example, if the compiler knew of a type wrapping f64 that guarantees the contained value cannot be NaN, then Option<NotNanF64> could represent None as the NaN bit-pattern. However, "null pointer in a type that doesn't allow null pointers" is easily the most common niche.
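Niches that are not the zero value can be observed with bool and char, which both have unused bit patterns, and contrasted with f64, where every bit pattern (including NaN) is a valid value. This is a sketch of what current rustc does in practice; the language does not promise these exact sizes for every type.

```rust
use std::mem::size_of;

fn main() {
    // bool only uses the byte values 0 and 1, so None can take another value.
    println!("bool: {}, Option<bool>: {}", size_of::<bool>(), size_of::<Option<bool>>());

    // char excludes surrogates and anything above 0x10FFFF, leaving spare patterns.
    println!("char: {}, Option<char>: {}", size_of::<char>(), size_of::<Option<char>>());

    // Every f64 bit pattern is valid, NaN included, so a separate tag is required.
    println!("f64: {}, Option<f64>: {}", size_of::<f64>(), size_of::<Option<f64>>());
}
```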

cdhowie