
TL;DR

I'm trying to write a macro that will do the following transformation:

magic_formatter!(["_{}", "{}_", "_{}_"], "foo") == 
    [format!("_{}", "foo"), 
     format!("{}_", "foo"), 
     format!("_{}_", "foo")]

(a solution that will give ["_foo", "foo_", "_foo_"] and works for varargs is also welcome)

Full story:

I'm writing a parser, and many of its tests do stuff like this:

    let ident = identifier().parse("foo ").unwrap();
    assert_eq!(ident, Syntax::ident("foo"));
    let ident = identifier().parse(" foo").unwrap();
    assert_eq!(ident, Syntax::ident("foo"));
    let ident = identifier().parse(" foo ").unwrap();
    assert_eq!(ident, Syntax::ident("foo"));

so I tried to reduce repetition by doing this:

    for f in [" {}", "{} ", " {} "] {
        let inp = format!(f, "foo");
        let ident = identifier().parse(inp).unwrap();
        assert_eq!(ident, Syntax::ident("foo"));
    }

which of course doesn't compile.

However, it seems to me that no information is missing that would prevent the whole array from being generated at compile time, so I searched the web, hoping this had already been solved somewhere, but I couldn't find anything that does what I want.

So I thought I'd get my hands dirty and write an actually useful rust macro for the first time(!).

I read the macro chapter of Rust by Example and failed for a while. Then I tried reading the actual reference, which got me a few steps further, but I still couldn't get it right. Then I really got into it and found this cool explanation and thought I actually had it this time, but I still can't get my macro to both compile and work properly.

My latest attempt looks like this:

    macro_rules! map_fmt {
    (@accum () -> $($body:tt),*) => { map_fmt!(@as_expr [$($body),*]) };

    (@accum ([$f:literal, $($fs:literal),*], $args:tt) -> $($body:tt),*) => {
        map_fmt!(@accum ([$($fs),*], $args) -> (format!($f, $args) $($body),*))
    };

    (@as_expr $e:expr) => { $e };

    ([$f:literal, $($fs:literal),*], $args:expr) => {
        map_fmt!(@accum ([$f, $($fs),*], $args) -> ())
    };
    }

I'd appreciate it if someone could help me understand what my macro is missing and how to fix it, if that's even possible. If not, is there some other technique I could or should use to reduce the repetition in my tests?

Edit:

This is the final solution I'm using. It is the answer provided by @Finomnis, slightly modified to support variadic arguments in the format! expression:

macro_rules! map_fmt {
    (@accum ([$f:literal], $($args:tt),*) -> ($($body:tt)*)) => { [$($body)* format!($f, $($args),*)] };

    (@accum ([$f:literal, $($fs:literal),*], $($args:tt),*) -> ($($body:tt)*)) => {
            map_fmt!(@accum ([$($fs),*], $($args),*) -> ($($body)* format!($f, $($args),*),))
    };

    ([$f:literal, $($fs:literal),*], $($args:expr),*) => {
            map_fmt!(@accum ([$f, $($fs),*], $($args),*) -> ())
    };
}
NivPgir
  • Nitpick: the book you've read is old and outdated, [here](https://veykril.github.io/tlborm/decl-macros/patterns/push-down-acc.html) is an updated version. – Chayim Friedman Jul 17 '22 at 00:53
  • Why do you need the `$body`? – Chayim Friedman Jul 17 '22 at 01:00
  • If all of your tests involve whitespace like this you can do this with `str.replace` instead of `format!`, or a purpose-built function returning `impl Iterator`. – cdhowie Jul 17 '22 at 05:27
  • @cdhowie it solves the code duplication perspective, but happens at runtime, which isn't too bad but the reason I'm asking about macros here is because I'm trying/hoping this can be done at compile time. – NivPgir Jul 17 '22 at 08:10
  • @ChayimFriedman as a place to collect intermediate output? but maybe I'm misunderstanding something? after all I can't seem to get it to work – NivPgir Jul 17 '22 at 08:20
  • @ChayimFriedman I think it's inspired by https://veykril.github.io/tlborm/decl-macros/patterns/push-down-acc.html – Finomnis Jul 17 '22 at 08:59
  • The main problem is that `[$f:literal, $($fs:literal),*]` doesn't match `[""]`, because the comma is missing (should be `["", ]`) – Finomnis Jul 17 '22 at 09:32
  • Be aware that `format` is *not* a compile time replacement either. – Finomnis Jul 17 '22 at 09:35

1 Answer

format!() doesn't work in your loop because it parses the format string at compile time, so its first argument must be an actual string literal, not a variable.

str::replace(), however, works:

fn main() {
    for f in [" {}", "{} ", " {} "] {
        let inp = f.replace("{}", "foo");
        println!("{:?}", inp);
    }
}
" foo"
"foo "
" foo "

I don't think there is any reason why doing this at runtime is a problem, especially as your format!() call in the macro is also a runtime replacement, but nonetheless I think this is an interesting challenge to learn more about macros.
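If the runtime route is acceptable, the replacement could be wrapped in a small helper that returns an iterator, so each test only names the word under test. This is a minimal sketch; `PATTERNS` and `padded_variants` are hypothetical names that do not appear in the question:

```rust
// Hypothetical pattern table; "{}" marks where the word is substituted.
static PATTERNS: [&str; 3] = [" {}", "{} ", " {} "];

// Hypothetical helper: yields every pattern with "{}" replaced by `word`.
// The substitution happens at runtime, one allocation per variant.
fn padded_variants(word: &str) -> impl Iterator<Item = String> + '_ {
    PATTERNS.iter().map(move |pattern| pattern.replace("{}", word))
}

fn main() {
    for input in padded_variants("foo") {
        println!("{:?}", input);
    }
}
```

A test loop would then iterate over `padded_variants("foo")` instead of a hand-written array.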

There are a couple of problems with your macro.

For one, the () case should be ([], $_:tt) instead.

But the main problem with your macro is that [$f:literal, $($fs:literal),*] does not match [""] (the case where only one literal is left) because it doesn't match the required comma. This one would match: ["",]. This can be solved by converting the $(),* into $(),+ (meaning, they have to carry at least one element) and then replacing the [] (no elements left) case with [$f:literal] (one element left). This then handles the special case where only one element is left and the comma doesn't match.
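To see the comma issue in isolation, here is a minimal sketch with a hypothetical `count_lits!` macro (not part of your code). It uses the same shape of rules: a base case for exactly one literal and a recursive case that requires at least one more literal after the comma.

```rust
// Hypothetical macro, for illustration only: counts the literals in a list.
macro_rules! count_lits {
    // Base case: exactly one literal left, so no comma is required.
    ([$f:literal]) => { 1usize };
    // Recursive case: `$(...),+` demands at least one literal after the comma,
    // so this rule never has to match a dangling `["x",]`.
    ([$f:literal, $($fs:literal),+]) => { 1usize + count_lits!([$($fs),+]) };
}

fn main() {
    // Matches the base case directly.
    assert_eq!(count_lits!(["a"]), 1);
    // Recurses twice, then hits the base case.
    assert_eq!(count_lits!(["a", "b", "c"]), 3);
    println!("{}", count_lits!(["a", "b", "c"]));
}
```

With only a `[$f:literal, $($fs:literal),*]` rule and no single-literal base case, the call `count_lits!(["a"])` would fail to match any rule.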

The way you collect your intermediate results has minor bugs in several places. In some places, you forgot the () around them, and the arguments may be in the wrong order. Further, it's better to transport them as $(tt)* instead of $(tt),*, as the tt contains the comma already.

Your @as_expr case doesn't serve much purpose according to the newer macro book, so I would remove it.

This is what your code could look like after fixing all those things:

macro_rules! map_fmt {
    (@accum ([$f:literal], $args:tt) -> ($($body:tt)*)) => {
        [$($body)* format!($f, $args)]
    };

    (@accum ([$f:literal, $($fs:literal),*], $args:tt) -> ($($body:tt)*)) => {
        map_fmt!(@accum ([$($fs),*], $args) -> ($($body)* format!($f, $args),))
    };

    ([$f:literal, $($fs:literal),*], $args:expr) => {
        map_fmt!(@accum ([$f, $($fs),*], $args) -> ())
    };
}

fn main() {
    let fmt = map_fmt!(["_{}", "{}_", "_{}_"], "foo");
    println!("{:?}", fmt);
}
["_foo", "foo_", "_foo_"]

However, if you use cargo expand to print what the macro resolves to, this is what you get:

#![feature(prelude_import)]
#[prelude_import]
use std::prelude::rust_2021::*;
#[macro_use]
extern crate std;
fn main() {
    let fmt = [
        {
            let res = ::alloc::fmt::format(::core::fmt::Arguments::new_v1(
                &["_"],
                &[::core::fmt::ArgumentV1::new_display(&"foo")],
            ));
            res
        },
        {
            let res = ::alloc::fmt::format(::core::fmt::Arguments::new_v1(
                &["", "_"],
                &[::core::fmt::ArgumentV1::new_display(&"foo")],
            ));
            res
        },
        {
            let res = ::alloc::fmt::format(::core::fmt::Arguments::new_v1(
                &["_", "_"],
                &[::core::fmt::ArgumentV1::new_display(&"foo")],
            ));
            res
        },
    ];
    {
        ::std::io::_print(::core::fmt::Arguments::new_v1(
            &["", "\n"],
            &[::core::fmt::ArgumentV1::new_debug(&fmt)],
        ));
    };
}

What you can clearly see here is that the format! is still a runtime call. So I don't think that the macro actually creates any kind of speedup.

You could fix that with the const_format crate:

macro_rules! map_fmt {
    (@accum ([$f:literal], $args:tt) -> ($($body:tt)*)) => {
        [$($body)* ::const_format::formatcp!($f, $args)]
    };

    (@accum ([$f:literal, $($fs:literal),*], $args:tt) -> ($($body:tt)*)) => {
        map_fmt!(@accum ([$($fs),*], $args) -> ($($body)* ::const_format::formatcp!($f, $args),))
    };

    ([$f:literal, $($fs:literal),*], $args:expr) => {{
        map_fmt!(@accum ([$f, $($fs),*], $args) -> ())
    }};
}

fn main() {
    let fmt = map_fmt!(["_{}", "{}_", "_{}_"], "foo");
    println!("{:?}", fmt);

    fn print_type_of<T>(_: &T) {
        println!("{}", std::any::type_name::<T>())
    }
    print_type_of(&fmt);
}
["_foo", "foo_", "_foo_"]
[&str; 3]

You can now see that the array's element type is &str (here &'static str), meaning the strings are now formatted at compile time and stored in the binary as static strings.
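If pulling in a crate is not wanted and the argument is itself a literal, the standard library's `concat!` might already be enough for the whitespace case from the question. This is only a sketch of that narrower alternative: `padded_literals!` is a hypothetical name, the patterns are hard-coded, and it does not accept format strings.

```rust
// Hypothetical macro; the three whitespace patterns are hard-coded.
// `concat!` only accepts literals, so the argument must be a literal too.
macro_rules! padded_literals {
    ($s:literal) => {
        [concat!(" ", $s), concat!($s, " "), concat!(" ", $s, " ")]
    };
}

fn main() {
    // The annotation compiles because `concat!` yields `&'static str`.
    let inputs: [&'static str; 3] = padded_literals!("foo");
    println!("{:?}", inputs);
}
```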


That all said, I think the entire recursion in the macro is quite pointless. It seems like it can be done with a single repetition:

macro_rules! map_fmt {
    ([$($fs:literal),*], $args:expr) => {{
        [$(format!($fs, $args)),*]
    }};
}

fn main() {
    let fmt = map_fmt!(["_{}", "{}_", "_{}_"], "foo");
    println!("{:?}", fmt);
}
["_foo", "foo_", "_foo_"]
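As a sketch of how this could plug back into the test loop from the question: the parser is not shown there, so `parse_identifier` below is a made-up stand-in that only trims whitespace. It is an assumption for illustration, not your `identifier().parse(..)`.

```rust
macro_rules! map_fmt {
    ([$($fs:literal),*], $args:expr) => {{
        [$(format!($fs, $args)),*]
    }};
}

// Hypothetical stand-in for the real parser: strips surrounding whitespace
// and rejects empty input.
fn parse_identifier(input: &str) -> Option<&str> {
    let trimmed = input.trim();
    if trimmed.is_empty() { None } else { Some(trimmed) }
}

fn main() {
    // Each element is a `String` produced by `format!` at runtime.
    for input in map_fmt!([" {}", "{} ", " {} "], "foo") {
        let ident = parse_identifier(&input).unwrap();
        assert_eq!(ident, "foo");
    }
    println!("all variants parsed");
}
```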

If you want to support an arbitrary number of arguments for format!(), then you could do:

macro_rules! map_fmt {
    (@format $f:literal, ($($args:expr),*)) => {
        format!($f, $($args),*)
    };

    ([$($fs:literal),*], $args:tt) => {{
        [$(map_fmt!(@format $fs, $args)),*]
    }};
}

fn main() {
    let fmt = map_fmt!(["_{}_{}", "{}__{}", "{}_{}_"], ("foo", "bar"));
    println!("{:?}", fmt);
}
["_foo_bar", "foo__bar", "foo_bar_"]
Finomnis
  • Thanks for the elaborate answer! Since format! is meant to reduce allocations, I assumed it interpolates strings statically when the compiler can deduce that's possible; it's good to learn that's not the case. As for your solution, it doesn't work with more than one argument to `format!`. Do you think there is a way to get that to work? Although I didn't explicitly ask for it, the lack of it greatly reduces the usefulness of such a macro. – NivPgir Jul 17 '22 at 10:33
  • I'm sure :) let me tinker with it. – Finomnis Jul 17 '22 at 10:37
  • I actually just got it to work by myself :D: ``` macro_rules! rt_map_fmt { (@accum ([$f:literal], $($args:tt),*) -> ($($body:tt)*)) => { [$($body)* format!($f, $($args),*)] }; (@accum ([$f:literal, $($fs:literal),*], $($args:tt),*) -> ($($body:tt)*)) => { rt_map_fmt!(@accum ([$($fs),*], $($args),*) -> ($($body)* format!($f, $($args),*),)) }; ([$f:literal, $($fs:literal),*], $($args:expr),*) => { rt_map_fmt!(@accum ([$f, $($fs),*], $($args),*) -> ()) }; } ``` – NivPgir Jul 17 '22 at 10:39
  • whoops, I didn't know that comments don't get code formatting, I edited the question to show my changes, @Finomnis, I would appreciate if you could look to see I haven't missed some edge case. – NivPgir Jul 17 '22 at 10:56
  • @NivPgir It doesn't compile a simple example, so no, I don't think it works ... If you "show a solution", the same rules as for the minimal reproducible example apply ... Add a `main` with it, and an output. As I did in my examples. So other people can reproduce your code. – Finomnis Jul 17 '22 at 11:05
  • I don't understand how `format!($f, $($args),*)` is supposed to work. – Finomnis Jul 17 '22 at 11:06
  • @NivPgir btw, found out the entire recursion is pointless, see the last section of my answer – Finomnis Jul 17 '22 at 11:10
  • `format!($f, $($args),*)`, I think it just transcribes all the `expr`s captured by the `$args` pattern. It's what I'm using right now, and it works :). about the non recursive solution, I actually got to it myself, a few minutes before I saw your solution, but that version can't work with variadic args, because once you add repetition to `$args` it requires that the length of `$fs` and `$args` are the same, which isn't always true, and also might cause unwanted behavior, I haven't checked that though. – NivPgir Jul 17 '22 at 11:19
  • I just realized that that's what you were trying to achieve all along. Ok, i thought you were trying to apply the formatters to multiple strings to get multiple outputs ... Again, that's the reason why you shouldn't just put a patch of code somewhere, please provide context. In the form of a usage example, including a `main`. – Finomnis Jul 17 '22 at 11:51
  • Added a solution for multiple arguments for `format` – Finomnis Jul 17 '22 at 11:52