
I'm writing some basic bioinformatics code to transcribe DNA to RNA:

pub enum DnaNucleotide {
    A,
    C,
    G,
    T,
}

pub enum RnaNucleotide {
    A,
    C,
    G,
    U,
}

fn transcribe(base: &DnaNucleotide) -> RnaNucleotide {
    match base {
        DnaNucleotide::A => RnaNucleotide::A,
        DnaNucleotide::C => RnaNucleotide::C,
        DnaNucleotide::G => RnaNucleotide::G,
        DnaNucleotide::T => RnaNucleotide::U,
    }
}

Is there a way to get the compiler to also check exhaustiveness on the right-hand side of the match expression, so that it enforces a one-to-one mapping between the two enums?

(A related question: The above is probably better represented with some kind of bijective map, but I don't want to lose the exhaustivity checking. Is there a better way?)

John Kugelman
rump roast
  • I understand that you want to place some sanity check, but in the end it's the programmer's job to program it right. The best way to enforce the exhaustive check is to *write goddamn unit tests*. – Alexey S. Larionov Nov 16 '21 at 15:47
  • That's not a great attitude. In the Rust world we've tried to eschew that old school "just don't write buggy code, duh" mentality. It's nice to have the compiler catch mistakes early. Unit tests often require one to duplicate logic, encoding the same requirements in two different places. – John Kugelman Nov 16 '21 at 17:44
  • I don't know if it's possible to catch this at compile time but it's a fair question to ask. If the answer is "it can't be done" we can say that without being dismissive, yeah? – John Kugelman Nov 16 '21 at 17:49

1 Answer


The fact that a one-to-one correspondence exists between two enums suggests that you should really only be using one enum behind the scenes. Here is an example of a data model that I think suits your needs. This is naturally exhaustive because there is only a single enum to begin with.

use core::fmt::{Debug, Error, Formatter};

enum NucleicAcid {
    Dna,
    Rna,
}

enum Nucleotide {
    A,
    C,
    G,
    TU,
}

struct BasePair {
    nucleic_acid: NucleicAcid,
    nucleotide: Nucleotide,
}

impl BasePair {
    fn new(nucleic_acid: NucleicAcid, nucleotide: Nucleotide) -> Self {
        Self {
            nucleic_acid,
            nucleotide,
        }
    }
}

impl Debug for BasePair {
    fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error> {
        use NucleicAcid::*;
        use Nucleotide::*;

        let BasePair {
            nucleic_acid,
            nucleotide,
        } = self;
        let nucleic_acid_str = match nucleic_acid {
            Dna => "dna",
            Rna => "rna",
        };
        let nucleotide_str = match nucleotide {
            A => "A",
            C => "C",
            G => "G",
            TU => match nucleic_acid {
                Dna => "T",
                Rna => "U",
            },
        };
        write!(f, "{}:{}", nucleic_acid_str, nucleotide_str)
    }
}

fn main() {
    let bp1 = BasePair::new(NucleicAcid::Dna, Nucleotide::TU);
    let bp2 = BasePair::new(NucleicAcid::Rna, Nucleotide::C);
    
    println!("{:?}, {:?}", bp1, bp2);
    // dna:T, rna:C
}
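
With this single-enum model, transcription in the question's sense no longer needs a mapping between two enums at all: the nucleotide stays the same and only the nucleic-acid tag changes. A minimal sketch of that idea follows. The `transcribe` and `symbol` methods are hypothetical helpers, not part of the code above, and the types are repeated with `Clone, Copy` derives only so that the snippet compiles on its own.

```rust
// Sketch under stated assumptions: `transcribe` and `symbol` are
// illustrative helpers added for this example.

#[derive(Clone, Copy)]
enum NucleicAcid {
    Dna,
    Rna,
}

#[derive(Clone, Copy)]
enum Nucleotide {
    A,
    C,
    G,
    TU,
}

#[derive(Clone, Copy)]
struct BasePair {
    nucleic_acid: NucleicAcid,
    nucleotide: Nucleotide,
}

impl BasePair {
    /// Keeps the nucleotide and relabels the strand as RNA.
    /// Calling this on a value that is already RNA returns it unchanged.
    fn transcribe(self) -> BasePair {
        BasePair {
            nucleic_acid: NucleicAcid::Rna,
            nucleotide: self.nucleotide,
        }
    }

    /// One-letter symbol. The T/U distinction is derived from the tag,
    /// and the compiler checks that every (acid, nucleotide) pair is handled.
    fn symbol(self) -> char {
        match (self.nucleic_acid, self.nucleotide) {
            (_, Nucleotide::A) => 'A',
            (_, Nucleotide::C) => 'C',
            (_, Nucleotide::G) => 'G',
            (NucleicAcid::Dna, Nucleotide::TU) => 'T',
            (NucleicAcid::Rna, Nucleotide::TU) => 'U',
        }
    }
}
```

Because there is only one `Nucleotide` enum, adding a fifth variant would force an update to the `match` in `symbol`, while `transcribe` would keep working unchanged; there is no second enum that could fall out of sync.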
BallpointBen