I wrote this function for an Advent of Code problem:
```rust
pub fn best_window_variable(grid: &Vec<Vec<i32>>) -> (usize, usize, usize) {
    let mut best_power = i32::MIN;
    let mut best_window = (0usize, 0usize, 0usize);
    for y in 0..grid.len() {
        for x in 0..grid[y].len() {
            // Largest square that still fits with its top-left corner at (x, y).
            let space = std::cmp::min(grid.len() - y, grid[y].len() - x);
            let mut window_power = 0i32;
            for z in 1..=space {
                // Grow the window incrementally: extend the running
                // (z-1)x(z-1) sum to z x z by adding the new right column...
                for dy in 0..z {
                    window_power += grid[y + dy][x + z - 1];
                }
                // ...and the new bottom row, excluding the shared corner.
                for dx in 0..(z - 1) {
                    window_power += grid[y + z - 1][x + dx];
                }
                if window_power > best_power {
                    best_power = window_power;
                    // The puzzle expects 1-based coordinates.
                    best_window = (x + 1, y + 1, z);
                }
            }
        }
    }
    best_window
}
```
The optimised (release) build is 50x faster than the unoptimised (debug) build, and I would like to understand which optimisations provide this improvement.
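For context, here is roughly how I am measuring it (a minimal sketch; the grid size and fill pattern are placeholders for my real puzzle-input setup). The 50x figure is the ratio between the `cargo run` and `cargo run --release` timings of this call:

```rust
use std::time::Instant;

fn main() {
    // Placeholder grid: the size and fill pattern stand in for my real
    // puzzle-input setup.
    let grid: Vec<Vec<i32>> = (0..300)
        .map(|y| (0..300).map(|x| (x * y) % 10 - 5).collect())
        .collect();

    let start = Instant::now();
    let best = best_window_variable(&grid);
    println!("best window: {:?}, took {:?}", best, start.elapsed());
}
```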
I am learning x86-64 assembly, and I have been trying to read the assembly output from cargo-asm for both builds to understand the difference.
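Concretely, I am running something along the lines of `cargo asm mycrate::best_window_variable --rust` (with `mycrate` standing in for my actual crate name) and trying to map the optimised output back to the debug output, but it is slow going.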
Is there an easier way to get to the answer?
Does LLVM log which optimisations are applied? If so, would it be possible to gauge which had the most impact on runtime?
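For example, would re-timing at intermediate optimisation levels (say, setting `opt-level = 1` or `2` under `[profile.dev]` in Cargo.toml) be a sensible way to narrow down where most of the speedup comes from?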
Would it be easier to use a tool like Ghidra to analyse the compiled code?