I have a crate that generates code with macros, using quote and proc_macro2::TokenStream.
When the size of the generated code increases a little, the compilation time explodes, and the disk usage of my SSD stays at 100% for a few minutes.
What advice should I follow to keep the compilation time reasonable?
For example, when the code generates one "element", the build takes 15 seconds, but for 5 elements, it takes 15 MINUTES, 60 times longer.
During most of the 15 minutes, it only uses the CPU (a lot). But at the end, for a few minutes, the disk usage is 100%.
What could explain the unreasonable disk usage by rustc when the code produced by the macro grows?
It is reproducible on two different computers, one with Windows 11, the other with Ubuntu 22. Both have an SSD and more than enough RAM (so it can't be a swap problem).
I've set up two branches to reproduce: macro1 and macro2 in https://github.com/tdelmas/floats
The diff between the two: https://github.com/tdelmas/floats/compare/macro1...macro2
/!\ DO NOT open the branch macro2 in vscode: between the background build that takes 15 minutes (with 100% disk usage at the end) and rust-analyzer, it will be hard to work.
To reproduce:
(all those steps were done in a terminal, with vscode - and rust-analyzer - closed)
The slow macro is only called in the code that generates the tests, so we can pre-build the app (and the dependencies) with cargo build first, and then build the app with its tests (which needs that macro to be built) with cargo test --no-run (the tests themselves are fast anyway, less than a second).
# initialisation
git clone https://github.com/tdelmas/floats
cd floats
Then we can prepare the build for macro1 with:
git switch macro1
# to remove any previous build
cargo clean
# download and build deps, then build without the impacted feature
cargo build
# build with the problematic macro
cargo test --no-run
That last command took 14 seconds on my Windows machine and 15 seconds on my other computer with Linux (Ubuntu). (slow, but reasonable)
To compare with the other branch, macro2:
git switch macro2
# to remove any previous build
cargo clean
# download and build deps, then build without the impacted feature
cargo build
# build with the problematic macro
cargo test --no-run
And that last command, which builds with the problematic macro, took 15 MINUTES on my Windows machine and 8 minutes on my Linux one.
The code between the two branches should not change in complexity, only in size. If we assume that most of the time is spent in the code where the diff is, then the build should take at most 5 times more, not 60.
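To make that comparison concrete, here is a small sketch that computes the observed slowdown from the timings I reported above and compares it with the factor of 5 I would expect if build time grew linearly with the amount of generated code. The function name slowdown is only for illustration.

```rust
/// Ratio between the slow and the fast build, both given in seconds.
fn slowdown(fast_secs: f64, slow_secs: f64) -> f64 {
    slow_secs / fast_secs
}

fn main() {
    // Timings reported above for `cargo test --no-run`.
    let windows = slowdown(14.0, 15.0 * 60.0); // macro1: 14 s, macro2: 15 min
    let linux = slowdown(15.0, 8.0 * 60.0); // macro1: 15 s, macro2: 8 min

    // What I would expect at most if 5 times more code cost 5 times more time.
    let linear_upper_bound = 5.0;

    println!("Windows: {windows:.1}x, Linux: {linux:.1}x, linear: {linear_upper_bound}x");
    assert!(windows > linear_upper_bound && linux > linear_upper_bound);
}
```

On both machines the measured ratio is far above 5, which is why I suspect something other than the raw amount of generated code.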
Or did I make a catastrophic mistake in my code?
I use in this version:
output.extend(quote! {
#add
#sub
#mul
#div
#rem
});
Where #add, #sub, ... are generated before with (simplifying): let add = quote! { ... },
but the previous version had one call to output.extend per variable, and was just as slow (I changed it to limit the number of calls to output.extend, but that apparently had no impact).
If I don't understand why the build time explodes, I can probably avoid the problem by generating less code with macros and adding methods on my objects to run more code at runtime, but I would really prefer not to, and I would like to understand what's happening here.
More context:
The problematic macro is called here: https://github.com/tdelmas/floats/blob/macro1/typed_floats/src/lib.rs#L106
The call to #add and the 4 others is in a double loop (of 12 elements each, from get_specifications).
They all generate similar code by calling test_op_self_rhs (which calls test_op_checks). None of those functions contains any loop. They just concatenate the outputs of quote! calls.
- First loop: https://github.com/tdelmas/floats/blob/macro1/typed_floats_macros/src/lib.rs#L309
- Second loop: https://github.com/tdelmas/floats/blob/macro1/typed_floats_macros/src/lib.rs#L343
These two loops generate all combinations of NonNaN, NonNaNFinite, NonZeroNonNaN, NonZeroNonNaNFinite, Positive, PositiveFinite, StrictlyPositive, StrictlyPositiveFinite, Negative, NegativeFinite, StrictlyNegative, StrictlyNegativeFinite to generate the tests that:
- verify that the output type of the operations (like +) is not too strict
- verify that the output type is as strict as possible
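To give an idea of how much code this double loop produces, here is a minimal sketch that mimics its shape. It uses plain String values as a stand-in for the quote! output, and generate_snippets is a hypothetical helper, not a function from my crate; the real macro emits much more code per combination.

```rust
// The twelve type names listed above.
const TYPES: [&str; 12] = [
    "NonNaN", "NonNaNFinite", "NonZeroNonNaN", "NonZeroNonNaNFinite",
    "Positive", "PositiveFinite", "StrictlyPositive", "StrictlyPositiveFinite",
    "Negative", "NegativeFinite", "StrictlyNegative", "StrictlyNegativeFinite",
];

// The five operators for which tests are generated.
const OPS: [(&str, &str); 5] = [
    ("add", "+"), ("sub", "-"), ("mul", "*"), ("div", "/"), ("rem", "%"),
];

/// Hypothetical helper: one placeholder snippet per (lhs, rhs, operator) combination.
fn generate_snippets() -> Vec<String> {
    let mut out = Vec::new();
    for lhs in TYPES {
        for rhs in TYPES {
            for (name, op) in OPS {
                // Stand-in for the token stream a quote! call would produce.
                out.push(format!("// test_{name}: {lhs} {op} {rhs}"));
            }
        }
    }
    out
}

fn main() {
    let snippets = generate_snippets();
    println!("{} generated snippets", snippets.len());
}
```

With 12 x 12 type pairs and 5 operators, that is 720 generated test fragments, against 144 for a single operator.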
The functions XXX_result, like add_result, take two types as parameters, plus the list of all types, and determine the resulting type if we add two numbers of those types. (If the result of that function is None, it means that the result may be NaN, thus none of those types fit and f64 must be the return type for that operation.)
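As an illustration of the idea, here is a heavily simplified sketch of such a function. It is not the code from my crate: the Spec struct, its fields, and the reduced list of six types are assumptions made for this example, and it ignores the zero / non-zero distinction entirely.

```rust
/// Hypothetical, simplified description of a float type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Spec {
    name: &'static str,
    negative: bool, // values with the sign bit set are allowed
    positive: bool, // values with the sign bit clear are allowed
    infinite: bool, // infinities are allowed
}

const fn spec(name: &'static str, negative: bool, positive: bool, infinite: bool) -> Spec {
    Spec { name, negative, positive, infinite }
}

const NON_NAN: Spec = spec("NonNaN", true, true, true);
const NON_NAN_FINITE: Spec = spec("NonNaNFinite", true, true, false);
const POSITIVE: Spec = spec("Positive", false, true, true);
const POSITIVE_FINITE: Spec = spec("PositiveFinite", false, true, false);
const NEGATIVE: Spec = spec("Negative", true, false, true);
const NEGATIVE_FINITE: Spec = spec("NegativeFinite", true, false, false);

const ALL: [Spec; 6] = [
    NON_NAN, NON_NAN_FINITE, POSITIVE, POSITIVE_FINITE, NEGATIVE, NEGATIVE_FINITE,
];

/// Strictest type in `all` that can hold `a + b`, or None if the sum may be NaN.
fn add_result(a: &Spec, b: &Spec, all: &[Spec]) -> Option<Spec> {
    // Adding two non-NaN values gives NaN only for +inf + -inf.
    let opposite_signs = (a.positive && b.negative) || (a.negative && b.positive);
    if a.infinite && b.infinite && opposite_signs {
        return None;
    }
    let negative = a.negative || b.negative;
    let positive = a.positive || b.positive;
    // Same-sign operands can overflow to infinity even when both are finite.
    let same_signs = (a.positive && b.positive) || (a.negative && b.negative);
    let infinite = a.infinite || b.infinite || same_signs;

    all.iter()
        // Keep the types that allow everything the sum can be...
        .filter(|t| (t.negative || !negative) && (t.positive || !positive) && (t.infinite || !infinite))
        // ...and pick the one that allows the least.
        .min_by_key(|t| t.negative as u8 + t.positive as u8 + t.infinite as u8)
        .copied()
}

fn main() {
    for (a, b) in [(POSITIVE_FINITE, POSITIVE_FINITE), (POSITIVE, NEGATIVE)] {
        let result = add_result(&a, &b, &ALL).map(|s| s.name);
        println!("{} + {} -> {:?}", a.name, b.name, result);
    }
}
```

In this simplified model, PositiveFinite + PositiveFinite gives Positive (the sum can overflow to infinity), while Positive + Negative gives None because +inf + -inf is NaN.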
(Cf https://crates.io/crates/typed_floats)
Crosspost: https://users.rust-lang.org/t/rust-macro-extremely-slow-and-with-high-disk-usage/96281