
An enum is clearly a kind of key/value pair structure. Consequently, it would be nice to automatically create a dictionary from one wherein the enum variants become the possible keys and their payload the associated values. Keys without a payload would use the unit value. Here is a possible usage example:

enum PaperType {
    PageSize(f32, f32),
    Color(String),
    Weight(f32),
    IsGlossy,
}

let mut dict = make_enum_dictionary!(
    PaperType, 
    allow_duplicates = true,
);

dict.insert(dict.PageSize, (8.5, 11.0));
dict.insert(dict.IsGlossy, ());
dict.insert_def(dict.IsGlossy);
dict.remove_all(dict.PageSize);

Significantly, since an enum is merely a list of values that may optionally carry a payload, auto-magically constructing a dictionary from it presents some semantic issues.

  1. How does a strongly typed Dictionary<K, V> maintain the discriminant/value_type dependency inherent with enums where each discriminant has a specific payload type?

    enum Ta { 
        K1(V1), 
        K2(V2), 
        ...,
        Kn(Vn),
    }
    
  2. How do you conveniently refer to an enum discriminant in code without its payload (Ta.K1?), and what type does it have (Ta::Discriminant?)?

  3. Is the value that is set and retrieved the entire enum value or just the payload?

    get(&self, key: Ta::Discriminant) -> Option<Ta>
    set(&mut self, value: Ta)
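
One way to explore questions 2 and 3 with only the standard library is `std::mem::discriminant`, which yields a payload-free, hashable token for a value's variant. The sketch below is an illustration under assumptions, not a full answer: `EnumDict`, `set` and `get` are hypothetical names, the whole enum value is stored, and a key can only be obtained from an existing value, so a lookup needs a throwaway instance of the wanted variant.

```rust
use std::collections::HashMap;
use std::mem::{discriminant, Discriminant};

#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum PaperType {
    PageSize(f32, f32),
    Color(String),
    Weight(f32),
    IsGlossy,
}

/// Hypothetical dictionary keyed by variant, storing the whole enum value.
struct EnumDict<T> {
    map: HashMap<Discriminant<T>, T>,
}

impl<T> EnumDict<T> {
    fn new() -> Self {
        Self { map: HashMap::new() }
    }

    /// The key is derived from the value itself, so the two cannot mismatch.
    /// Returns the previous value stored for the same variant, if any.
    fn set(&mut self, value: T) -> Option<T> {
        self.map.insert(discriminant(&value), value)
    }

    /// `probe` is any value of the wanted variant; its payload is ignored.
    fn get(&self, probe: &T) -> Option<&T> {
        self.map.get(&discriminant(probe))
    }
}
```

With this shape, `dict.get(&PaperType::PageSize(0.0, 0.0))` finds whatever `PageSize` was stored, which also shows the awkwardness the question is about: there is no payload-free way to name the variant.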
    

If it were possible to augment an existing enum auto-magically with another enum of its variants, then a reasonably efficient solution seems plausible, as in the following pseudo code:

type D = add_discriminant_keys!( T );

impl<D> for Vec<D> {
    fn get(&self, key: D::Discriminant) -> Option<D> { todo!() }
    fn set(&mut self, value: D) { todo!() }
}

I am not aware whether the macro, add_discriminant_keys!, or the construct, D::Discriminant, is even feasible. Unfortunately, I am still splashing in the shallow end of the Rust pool, despite this suggestion. However, the boldness of its macro language suggests many things are possible to those who believe.

Handling of duplicates is an implementation detail.

Enum discriminants are typically functions and therefore have a fixed pointer value (as far as I know). If such values could become constants of an associated type within the enum (like a trait) with attributes similar to what has been realized by strum::EnumDiscriminants things would look good. As it is, EnumDiscriminants seems like a sufficient interim solution.
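
For reference, `strum::EnumDiscriminants` produces a payload-free companion enum together with conversions from the original enum, as far as I understand its documentation. The following is a hand-written approximation of that shape, not the macro's actual expansion; the names `PaperTypeKey` and `set` are made up for this sketch.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum PaperType {
    PageSize(f32, f32),
    Color(String),
    Weight(f32),
    IsGlossy,
}

/// Payload-free mirror of `PaperType`, usable as a map key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum PaperTypeKey {
    PageSize,
    Color,
    Weight,
    IsGlossy,
}

impl From<&PaperType> for PaperTypeKey {
    fn from(value: &PaperType) -> Self {
        match value {
            PaperType::PageSize(..) => PaperTypeKey::PageSize,
            PaperType::Color(..) => PaperTypeKey::Color,
            PaperType::Weight(..) => PaperTypeKey::Weight,
            PaperType::IsGlossy => PaperTypeKey::IsGlossy,
        }
    }
}

/// Stores `value` under the key derived from its own variant.
fn set(map: &mut HashMap<PaperTypeKey, PaperType>, value: PaperType) -> Option<PaperType> {
    map.insert(PaperTypeKey::from(&value), value)
}
```

Lookups can then use the bare key, for example `map.get(&PaperTypeKey::Color)`, at the cost of returning the whole enum value rather than just the payload.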

A generic implementation over HashMap using the strum_macros crate is provided in the Rust playground; however, it does not run there because the playground cannot load the strum crate. A macro-derived solution would be nice.

George
  • What would be the exact type of `dict`? – user4815162342 Dec 22 '21 at 08:29
  • I feel this use-case is better suited by a `struct` with optional fields. See a [suggestion](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1eb90970fc009f12a95dd8d4cca47e39). – kmdreko Dec 22 '21 at 08:32
  • I do use such macros but with a struct to automatically 1) create the struct with all its fields 2) set the default values by creating a default instance 3) read the values from a configuration file. The goal is to have very fast accesses, unlike maps, for skin entries. See https://github.com/Canop/broot/blob/master/src/skin/style_map.rs#L28. IMO such macro is quite direct to write and should not be generic but tailored for your use case – Denys Séguret Dec 22 '21 at 08:46
  • A Rust enum is not a key/value pair structure in the sense that it is a Sum Type, as opposed to a struct, which is a Product Type (see https://en.wikipedia.org/wiki/Algebraic_data_type). If an enum was a dictionary, it would only ever store one key/value pair at a time. So I second @kmdreko's suggestion that such a macro would make more sense with a struct. – SirDarius Dec 22 '21 at 11:08
  • What OP might want is an enum for the keys, and a struct for the dictionary. – Denys Séguret Dec 22 '21 at 11:12
  • *If it were possible to augment an existing enum [...] with another enum of its variants*: [`EnumDiscriminants`](https://docs.rs/strum/latest/strum/derive.EnumDiscriminants.html) – Shepmaster Dec 22 '21 at 19:42

2 Answers


First, as already said in the comments, the right way to go is a struct with optional values.

However, for completeness' sake, I'll show here how you can do that with a proc macro.


When you want to design a macro, especially a complicated one, the first thing to do is to plan what the emitted code will be. So, let's try to write the macro's output for the following reduced enum:

enum PaperType {
    PageSize(f32, f32),
    IsGlossy,
}

I'll warn you up front that our macro will not support brace-style enum variants, nor combining enums (your add_discriminant_keys!()). Both are possible to support, but both would complicate this already-complicated answer further. I'll return to them briefly at the end.

First, let's design the map. It will be in a support crate. Let's call this crate denum (a name will be necessary later, when we'll refer to it from our macro):

pub struct Map<E> {
    map: std::collections::HashMap<E, E>, // You can use any map implementation you want.
}

We want to store the discriminant as the key, and the enum as the value. So, we need a way to refer to the payload-free discriminant. Let's create a trait Enum:

use std::hash::Hash;

pub trait Enum {
    type DiscriminantsEnum: Eq + Hash; // The constraints are those of `HashMap`.
}

Now our map will look like that:

pub struct Map<E: Enum> {
    map: std::collections::HashMap<E::DiscriminantsEnum, E>,
}

Our macro will generate the implementation of Enum. Hand-written, it'll be the following (note that in the macro, I wrap it in const _: () = { ... }. This is a technique used to keep generated names from polluting the surrounding namespace):

#[derive(PartialEq, Eq, Hash)]
pub enum PaperTypeDiscriminantsEnum {
    PageSize,
    IsGlossy,
}

impl Enum for PaperType {
    type DiscriminantsEnum = PaperTypeDiscriminantsEnum;
}
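
The `const _: () = { ... }` trick mentioned above can be tried in isolation. This is a minimal sketch with made-up names (`Describe`, `Paper`, `Ink`, `helper`), unrelated to the `denum` types: each anonymous const opens its own scope, so helper items declared inside stay private to that block, while trait impls written inside still apply everywhere.

```rust
pub trait Describe {
    fn describe() -> &'static str;
}

pub struct Paper;
pub struct Ink;

// Both blocks define a function called `helper`. Because each anonymous
// const is its own scope, the two definitions do not clash and neither
// is visible at module level.
const _: () = {
    fn helper() -> &'static str {
        "paper"
    }

    impl Describe for Paper {
        fn describe() -> &'static str {
            helper()
        }
    }
};

const _: () = {
    fn helper() -> &'static str {
        "ink"
    }

    impl Describe for Ink {
        fn describe() -> &'static str {
            helper()
        }
    }
};
```

Calling `Paper::describe()` and `Ink::describe()` from outside the blocks works, whereas calling `helper()` from outside does not compile.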

Next, the insert() operation:

impl<E: Enum> Map<E> {
    pub fn insert(&mut self, discriminant: E::DiscriminantsEnum, value: /* What's here? */) {}
}

There is no way in current Rust to refer to an enum discriminant as a distinct type. But there is a way to refer to a struct as a distinct type.

We can think about the following:

pub struct PageSize;

But this pollutes the global namespace. Of course, we can call it something like PaperTypePageSize, but I much prefer something like PaperTypeDiscriminants::PageSize.

Modules to the rescue!

#[allow(non_snake_case)]
pub mod PaperTypeDiscriminants {
    #[derive(Clone, Copy)]
    pub struct PageSize;
    #[derive(Clone, Copy)]
    pub struct IsGlossy;
}

Now we need a way in insert() to validate that the provided discriminant indeed matches the wanted enum, and to refer to its value. A new trait!

pub trait EnumDiscriminant: Copy {
    type Enum: Enum;
    type Value;
    
    fn to_discriminants_enum(self) -> <Self::Enum as Enum>::DiscriminantsEnum;
    fn to_enum(self, value: Self::Value) -> Self::Enum;
}

And here's how our macro will implement it:

impl EnumDiscriminant for PaperTypeDiscriminants::PageSize {
    type Enum = PaperType;
    type Value = (f32, f32);
    
    fn to_discriminants_enum(self) -> PaperTypeDiscriminantsEnum { PaperTypeDiscriminantsEnum::PageSize }
    fn to_enum(self, (v0, v1): Self::Value) -> Self::Enum { Self::Enum::PageSize(v0, v1) }
}
impl EnumDiscriminant for PaperTypeDiscriminants::IsGlossy {
    type Enum = PaperType;
    type Value = ();
    
    fn to_discriminants_enum(self) -> PaperTypeDiscriminantsEnum { PaperTypeDiscriminantsEnum::IsGlossy }
    fn to_enum(self, (): Self::Value) -> Self::Enum { Self::Enum::IsGlossy }
}

And now insert():

pub fn insert<D>(&mut self, discriminant: D, value: D::Value)
where
    D: EnumDiscriminant<Enum = E>,
{
    self.map.insert(
        discriminant.to_discriminants_enum(),
        discriminant.to_enum(value),
    );
}

And trivially insert_def():

pub fn insert_def<D>(&mut self, discriminant: D)
where
    D: EnumDiscriminant<Enum = E, Value = ()>,
{
    self.insert(discriminant, ());
}

And get() (note: getting the value separately is possible when removing, by adding a method to the trait EnumDiscriminant with the signature fn enum_to_value(enum_: Self::Enum) -> Self::Value. It can be an unsafe fn and use unreachable_unchecked() for better performance. But with get() and get_mut(), which return references, it's harder because you can't get a reference to the discriminant value. Here's a playground that does that nonetheless, but requires nightly):

pub fn get_entry<D>(&self, discriminant: D) -> Option<&E>
where
    D: EnumDiscriminant<Enum = E>,
{
    self.map.get(&discriminant.to_discriminants_enum())
}

get_mut() is very similar.
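
Putting the pieces so far together, the design can be exercised without any macro. The sketch below hand-writes what the macro would emit for the reduced enum. Two things are assumptions of mine rather than part of the answer: the constructor `Map::new` and the name and shape of `get_entry_mut`.

```rust
use std::collections::HashMap;
use std::hash::Hash;

pub trait Enum {
    type DiscriminantsEnum: Eq + Hash;
}

pub trait EnumDiscriminant: Copy {
    type Enum: Enum;
    type Value;

    fn to_discriminants_enum(self) -> <Self::Enum as Enum>::DiscriminantsEnum;
    fn to_enum(self, value: Self::Value) -> Self::Enum;
}

pub struct Map<E: Enum> {
    map: HashMap<E::DiscriminantsEnum, E>,
}

impl<E: Enum> Map<E> {
    // Assumed constructor, added so the sketch can be run.
    pub fn new() -> Self {
        Self { map: HashMap::new() }
    }

    pub fn insert<D: EnumDiscriminant<Enum = E>>(&mut self, discriminant: D, value: D::Value) {
        self.map
            .insert(discriminant.to_discriminants_enum(), discriminant.to_enum(value));
    }

    pub fn insert_def<D: EnumDiscriminant<Enum = E, Value = ()>>(&mut self, discriminant: D) {
        self.insert(discriminant, ());
    }

    pub fn get_entry<D: EnumDiscriminant<Enum = E>>(&self, discriminant: D) -> Option<&E> {
        self.map.get(&discriminant.to_discriminants_enum())
    }

    // Assumed shape of the mutable counterpart.
    pub fn get_entry_mut<D: EnumDiscriminant<Enum = E>>(&mut self, discriminant: D) -> Option<&mut E> {
        self.map.get_mut(&discriminant.to_discriminants_enum())
    }
}

// Hand-written stand-in for the macro output.
#[derive(Debug, PartialEq)]
pub enum PaperType {
    PageSize(f32, f32),
    IsGlossy,
}

#[derive(PartialEq, Eq, Hash)]
pub enum PaperTypeDiscriminantsEnum {
    PageSize,
    IsGlossy,
}

impl Enum for PaperType {
    type DiscriminantsEnum = PaperTypeDiscriminantsEnum;
}

#[allow(non_snake_case)]
pub mod PaperTypeDiscriminants {
    #[derive(Clone, Copy)]
    pub struct PageSize;
    #[derive(Clone, Copy)]
    pub struct IsGlossy;
}

impl EnumDiscriminant for PaperTypeDiscriminants::PageSize {
    type Enum = PaperType;
    type Value = (f32, f32);

    fn to_discriminants_enum(self) -> PaperTypeDiscriminantsEnum {
        PaperTypeDiscriminantsEnum::PageSize
    }
    fn to_enum(self, (v0, v1): Self::Value) -> PaperType {
        PaperType::PageSize(v0, v1)
    }
}

impl EnumDiscriminant for PaperTypeDiscriminants::IsGlossy {
    type Enum = PaperType;
    type Value = ();

    fn to_discriminants_enum(self) -> PaperTypeDiscriminantsEnum {
        PaperTypeDiscriminantsEnum::IsGlossy
    }
    fn to_enum(self, (): Self::Value) -> PaperType {
        PaperType::IsGlossy
    }
}
```

Usage then mirrors the question: `dict.insert(PaperTypeDiscriminants::PageSize, (8.5, 11.0))` type-checks, while passing a `String` as the value for `PageSize` is rejected at compile time.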

Note that my code doesn't handle duplicates but instead overwrites them, as it uses HashMap. However, you can easily create your own map that handles duplicates in another way.
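
As a rough illustration of "another way", here is a sketch that keeps duplicates by storing a `Vec` per discriminant and supports the `remove_all` from the question. `MultiMap` and `remove_all` are hypothetical, and `enum_to_value` deliberately deviates from the signature mentioned earlier: it returns an `Option` instead of being `unsafe`, trading a little speed for safety. The discriminant struct is also kept at top level for brevity.

```rust
use std::collections::HashMap;
use std::hash::Hash;

pub trait Enum {
    type DiscriminantsEnum: Eq + Hash;
}

pub trait EnumDiscriminant: Copy {
    type Enum: Enum;
    type Value;

    fn to_discriminants_enum(self) -> <Self::Enum as Enum>::DiscriminantsEnum;
    fn to_enum(self, value: Self::Value) -> Self::Enum;
    /// Extracts the payload; `None` if `value` is a different variant.
    fn enum_to_value(value: Self::Enum) -> Option<Self::Value>;
}

/// Hypothetical map that keeps every inserted value instead of overwriting.
pub struct MultiMap<E: Enum> {
    map: HashMap<E::DiscriminantsEnum, Vec<E>>,
}

impl<E: Enum> MultiMap<E> {
    pub fn new() -> Self {
        Self { map: HashMap::new() }
    }

    pub fn insert<D: EnumDiscriminant<Enum = E>>(&mut self, discriminant: D, value: D::Value) {
        self.map
            .entry(discriminant.to_discriminants_enum())
            .or_default()
            .push(discriminant.to_enum(value));
    }

    /// Removes all entries of one variant and returns their payloads
    /// in insertion order.
    pub fn remove_all<D: EnumDiscriminant<Enum = E>>(&mut self, discriminant: D) -> Vec<D::Value> {
        self.map
            .remove(&discriminant.to_discriminants_enum())
            .unwrap_or_default()
            .into_iter()
            .filter_map(|entry| D::enum_to_value(entry))
            .collect()
    }
}

#[allow(dead_code)]
pub enum PaperType {
    PageSize(f32, f32),
    IsGlossy,
}

#[allow(dead_code)]
#[derive(PartialEq, Eq, Hash)]
pub enum PaperTypeDiscriminantsEnum {
    PageSize,
    IsGlossy,
}

impl Enum for PaperType {
    type DiscriminantsEnum = PaperTypeDiscriminantsEnum;
}

#[derive(Clone, Copy)]
pub struct PageSize;

impl EnumDiscriminant for PageSize {
    type Enum = PaperType;
    type Value = (f32, f32);

    fn to_discriminants_enum(self) -> PaperTypeDiscriminantsEnum {
        PaperTypeDiscriminantsEnum::PageSize
    }
    fn to_enum(self, (v0, v1): Self::Value) -> PaperType {
        PaperType::PageSize(v0, v1)
    }
    fn enum_to_value(value: PaperType) -> Option<Self::Value> {
        match value {
            PaperType::PageSize(v0, v1) => Some((v0, v1)),
            _ => None,
        }
    }
}
```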


Now that we have a clear picture in mind of what the macro should generate, let's write it!

I decided to write it as a derive macro. You can use an attribute macro too, and even a function-like macro, but you must call it at the declaration site of your enum, because macros cannot inspect code other than the code they're applied to.

The enum will look like:

#[derive(denum::Enum)]
enum PaperType {
    PageSize(f32, f32),
    Color(String),
    Weight(f32),
    IsGlossy,
}

Usually, when my macro needs helper code, I put this code in crate and the macro in crate_macros, and re-export the macro from crate. So, the code in denum (besides the aforementioned Enum, EnumDiscriminant and Map):

pub use denum_macros::Enum;

denum_macros/src/lib.rs:

use proc_macro::TokenStream;

use quote::{format_ident, quote};

#[proc_macro_derive(Enum)]
pub fn derive_enum(item: TokenStream) -> TokenStream {
    let item = syn::parse_macro_input!(item as syn::DeriveInput);
    if !item.generics.params.is_empty() {
        return syn::Error::new_spanned(
            item.generics,
            "`denum::Enum` does not work with generics currently",
        )
        .into_compile_error()
        .into();
    }
    if item.generics.where_clause.is_some() {
        return syn::Error::new_spanned(
            item.generics.where_clause,
            "`denum::Enum` does not work with `where` clauses currently",
        )
        .into_compile_error()
        .into();
    }

    let (vis, name, variants) = match item {
        syn::DeriveInput {
            vis,
            ident,
            data: syn::Data::Enum(syn::DataEnum { variants, .. }),
            ..
        } => (vis, ident, variants),
        _ => {
            return syn::Error::new_spanned(item, "`denum::Enum` works only with enums")
                .into_compile_error()
                .into()
        }
    };

    let discriminants_mod_name = format_ident!("{}Discriminants", name);
    let discriminants_enum_name = format_ident!("{}DiscriminantsEnum", name);

    let mut discriminants_enum = Vec::new();
    let mut discriminant_structs = Vec::new();
    let mut enum_discriminant_impls = Vec::new();
    for variant in variants {
        let variant_name = variant.ident;

        discriminant_structs.push(quote! {
            #[derive(Clone, Copy)]
            pub struct #variant_name;
        });

        let fields = match variant.fields {
            syn::Fields::Named(_) => {
                return syn::Error::new_spanned(
                    variant.fields,
                    "`denum::Enum` does not work with brace-style variants currently",
                )
                .into_compile_error()
                .into()
            }
            syn::Fields::Unnamed(fields) => Some(fields.unnamed),
            syn::Fields::Unit => None,
        };
        let value_destructuring = fields
            .iter()
            .flatten()
            .enumerate()
            .map(|(index, _)| format_ident!("v{}", index));
        let value_destructuring = quote!((#(#value_destructuring,)*));
        let value_builder = if fields.is_some() {
            value_destructuring.clone()
        } else {
            quote!()
        };
        let value_type = fields.into_iter().flatten().map(|field| field.ty);
        enum_discriminant_impls.push(quote! {
            impl ::denum::EnumDiscriminant for #discriminants_mod_name::#variant_name {
                type Enum = #name;
                type Value = (#(#value_type,)*);

                fn to_discriminants_enum(self) -> #discriminants_enum_name { #discriminants_enum_name::#variant_name }
                fn to_enum(self, #value_destructuring: Self::Value) -> Self::Enum { Self::Enum::#variant_name #value_builder }
            }
        });

        discriminants_enum.push(variant_name);
    }

    quote! {
        #[allow(non_snake_case)]
        #vis mod #discriminants_mod_name { #(#discriminant_structs)* }

        const _: () = {
            #[derive(PartialEq, Eq, Hash)]
            pub enum #discriminants_enum_name { #(#discriminants_enum,)* }

            impl ::denum::Enum for #name {
                type DiscriminantsEnum = #discriminants_enum_name;
            }

            #(#enum_discriminant_impls)*
        };
    }
    .into()
}

This macro has several flaws: it doesn't handle visibility modifiers and attributes correctly, for example. But in the general case, it works, and you can fine-tune it more.
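
One detail worth knowing when fine-tuning: because `Value` is emitted as `(#(#value_type,)*)`, a single-field variant such as `Weight(f32)` gets the one-element tuple `(f32,)` as its value type, not a bare `f32`. The sketch below hand-expands that case. It uses a trimmed-down copy of the trait (no `Enum` bound, no `to_discriminants_enum`) and a top-level `Weight` struct purely to stay short; those simplifications are mine.

```rust
// Trimmed-down copy of the trait, for illustration only.
pub trait EnumDiscriminant: Copy {
    type Enum;
    type Value;

    fn to_enum(self, value: Self::Value) -> Self::Enum;
}

#[allow(dead_code)]
#[derive(Debug, PartialEq)]
pub enum PaperType {
    Weight(f32),
    IsGlossy,
}

#[derive(Clone, Copy)]
pub struct Weight;

// Roughly what the `quote!` templates produce for a one-field variant.
impl EnumDiscriminant for Weight {
    type Enum = PaperType;
    type Value = (f32,); // one-element tuple, not a bare `f32`

    fn to_enum(self, (v0,): Self::Value) -> PaperType {
        PaperType::Weight(v0)
    }
}
```

Callers therefore write `dict.insert(PaperTypeDiscriminants::Weight, (80.0,))` with the trailing comma; special-casing single-field variants in the macro would remove that wart.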

If you want to also support brace-style variants, you can create a struct with the data (instead of the tuple we use currently).
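
A possible shape for that, hand-written: the variant `Margins { top, bottom }` and the generated struct name `MarginsValue` are invented for this sketch, and the trait is again a trimmed-down copy without the `Enum` bound.

```rust
// Trimmed-down copy of the trait, for illustration only.
pub trait EnumDiscriminant: Copy {
    type Enum;
    type Value;

    fn to_enum(self, value: Self::Value) -> Self::Enum;
}

#[derive(Debug, PartialEq)]
pub enum PaperType {
    Margins { top: f32, bottom: f32 },
}

/// Hypothetical generated struct carrying the named fields of `Margins`.
#[derive(Debug, PartialEq)]
pub struct MarginsValue {
    pub top: f32,
    pub bottom: f32,
}

#[derive(Clone, Copy)]
pub struct Margins;

impl EnumDiscriminant for Margins {
    type Enum = PaperType;
    type Value = MarginsValue;

    // Destructure the struct and rebuild the brace-style variant.
    fn to_enum(self, MarginsValue { top, bottom }: Self::Value) -> PaperType {
        PaperType::Margins { top, bottom }
    }
}
```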

Combining enums is possible if you use a function-like macro instead of a derive macro and invoke it on both enums, like:

denum::enums! {
    enum A { ... }
    enum B { ... }
}

Then the macro will have to combine the discriminants and use something like Either<A, B> when operating with the map.

Chayim Friedman
  • Wow. Interesting. A lot of work. Will have to study it. Thank you. – George Dec 23 '21 at 04:27
  • This is mainly for myself, I like to solve such problems ;) – Chayim Friedman Dec 23 '21 at 05:51
  • The insert method should just take only an instance of the `enum`. From that it may derive the associated discriminant. This constraint also removes the ability to associate the wrong discriminant with a value. Note, the value is a full instance of the `enum` complete with discriminant. – George Dec 24 '21 at 07:34
  • @George I don't understand what you want to say. Yes, this is typesafe. – Chayim Friedman Dec 24 '21 at 09:17
  • I have moved the generic solution to [rust playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c632aa50a75c07ad878bf5ed4c33f546) – George Dec 25 '21 at 05:03
  • @George Now I understand, but I don't understand why you need the discriminant separately then? – Chayim Friedman Dec 26 '21 at 00:32
  • I don't understand. In the map, do you want to store the discriminant only or with the payload? If with the payload, why don't you use just `HashSet` or whatever? – Chayim Friedman Dec 26 '21 at 06:12
  • The discriminant gives `enum` values a payload-free form and thereby allows a version of it to function as a key. – George Dec 26 '21 at 06:48
  • Fundamentally we have a `map` where the key is derived from the stored value. In this treatment, the discriminant is derived from its `enum` value, allowing any `enum` definition to provide both the keys and their value types (set of possible values). (In a more general case derivation may include some of the payload.) Now, if it were possible for the discriminant to legally exist without its payload then a separate discriminant type may not be necessary since the following expression, `get(PaperType::PageSize)`, would be valid in `map`, `HashMap`. – George Dec 26 '21 at 15:24
  • Actually, `get(PaperType::PageSize)` would not be in `HashMap` but probably something more like `EnumHashMap` where `type HashFunc = Fn(PageType) -> ??` and the expression, `get(PaperType::PageSize)` would be valid for `fn get(key: ??) -> Option {..}`. – George Dec 26 '21 at 15:39
  • Actually, the ideal design might be a generic determinant,`Determinant`, allow something like the following: `type EnumHashMap = HashMap, EnumType>;` with `fn get(key: Determinant) -> Option {..}`. Not sure if such a generic `Determinant` is feasible in `rust`. – George Dec 26 '21 at 16:53
  • I fail to understand how this is different from my solution (except I used an associated type and not generic parameter, because it is more correct here). – Chayim Friedman Dec 26 '21 at 22:54
  • My response was never a critique or even a commentary on your solution, per se. I apologize if I gave that impression. I did not thoroughly review your solution being that I was still exploring the question itself. I will take a closer look at your solution shortly. Incidentally, what I did realize eventually is that a simple generic approach was possible and maybe helpful to you as a clarification when it seems that a single `set` method trait implemented over `HashMap` would suffice. – George Dec 27 '21 at 00:50
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/240462/discussion-between-george-and-chayim-friedman). – George Dec 27 '21 at 01:14

Unfortunately, a couple of questions arise in that context:

  • should it be possible to use enum types only once? Or are there some which might be there multiple times?
  • what should happen if you insert a PageSize and there's already a PageSize in the dictionary?

All in all, a regular struct PaperType is much more suitable to properly model your domain. If you don't want to deal with Option, you can implement the Default trait to ensure that some sensible defaults are always available.
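
A minimal sketch of that struct-plus-`Default` approach follows. The field names and default values are illustrative guesses, not taken from the question, and `glossy_a4` is a made-up helper showing struct update syntax.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct PaperType {
    pub page_size: (f32, f32),
    pub color: String,
    pub weight: f32,
    pub is_glossy: bool,
}

impl Default for PaperType {
    fn default() -> Self {
        Self {
            // Illustrative defaults; pick whatever suits the domain.
            page_size: (8.5, 11.0),
            color: String::from("white"),
            weight: 80.0,
            is_glossy: false,
        }
    }
}

/// Overrides only the fields of interest; the rest come from `Default`.
pub fn glossy_a4() -> PaperType {
    PaperType {
        page_size: (8.27, 11.69),
        is_glossy: true,
        ..PaperType::default()
    }
}
```

Every "key" is then an ordinary field with its own type, so the compiler enforces the key/value pairing that the dictionary design has to work for.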

If you really, really want to go with a collection-style interface, the closest approximation would probably be a HashSet<PaperType>. You could then insert a value PaperType::PageSize.
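
One caveat: `PaperType` as written cannot simply derive `Hash` and `Eq`, because `f32` implements neither. A possible workaround, which is an assumption of this sketch rather than part of the suggestion above, is to compare and hash by variant only via `std::mem::discriminant`. That makes the set hold at most one value per variant, at the price of an `==` that ignores payloads, which may surprise other code. `page_width` is a made-up helper.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::mem::discriminant;

#[allow(dead_code)]
#[derive(Debug)]
pub enum PaperType {
    PageSize(f32, f32),
    Weight(f32),
    IsGlossy,
}

// Equality and hashing look only at the variant, ignoring the payload.
impl PartialEq for PaperType {
    fn eq(&self, other: &Self) -> bool {
        discriminant(self) == discriminant(other)
    }
}

impl Eq for PaperType {}

impl Hash for PaperType {
    fn hash<H: Hasher>(&self, state: &mut H) {
        discriminant(self).hash(state);
    }
}

/// Returns the stored width of the `PageSize` entry, if there is one.
/// The probe's payload is irrelevant because only the variant is compared.
pub fn page_width(set: &HashSet<PaperType>) -> Option<f32> {
    match set.get(&PaperType::PageSize(0.0, 0.0)) {
        Some(PaperType::PageSize(width, _)) => Some(*width),
        _ => None,
    }
}
```

With these impls, `insert` leaves an existing `PageSize` untouched and returns `false`, while `replace` swaps in the new payload.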

Marcus Ilgner
  • Thanks for your response. The handling of duplicates is an implementation detail. I am not presently sure if two `PageSize` with differing payloads would constitute the same value within the `HashSet`. Ultimately, however, the idea is to somehow use the `enum` variants as lookup keys. On reflection, the returned value could be the entirety of the corresponding `enum` value and not just its payload... due to strong typing I suppose that would have to be the case for `get/set` to handle all values. – George Dec 22 '21 at 19:47