I have a hand-rolled matrix algorithm which finds the largest number of a right lower square of a square matrix (thus when iterating, some parts are 'jumped' over) - stored as a dense matrix. After an update from vs2010 to vs2017 it seems to be much slower - about a 50% slowdown overall. After some investigation, this was located to the inner loop of a function finding absolute largest value. Looking at the assembler output, this seems be due to some extra mov instructions being inserted within the tight loop. Reworking the loop in different ways seems to solve or partly solve the issue. gcc doesn't seem to have this "issue" in comparison.
Simplified examples (fabs
not always neccessary to reproduce):
#include <cmath>
#include <iostream>
int f_slow(double *A, size_t from, size_t w)
{
double biga_absval = *A;
size_t ir = 0,ic=0;
for ( size_t j = 0; j < w; j++ ) {
size_t n = j*w;
for ( ; n < j*w+w; n++ ) {
if ( fabs(A[n]) <= biga_absval ) {
biga_absval = fabs( A[n] );
ir = j;
ic = n;
}
n++;
}
}
std::cout << ir <<ic;
return 0;
}
int f_fast(double *A, size_t from, size_t w)
{
double* biga = A;
double biga_absval = *biga;
double* n_begin = A + from;
double* n_end = A + w;
for (double* A_n = n_begin; A_n < n_end; ++A_n) {
if (fabs(*A_n) > biga_absval) {
biga_absval = fabs(*A_n);
biga = A_n;
}
}
std::cout << biga;
return 0;
}
int f_faster(double *A, size_t from, size_t w)
{
double biga_absval = *A;
size_t ir = 0,ic=0;
for ( size_t j = 0; j < w; j++ ) {
size_t n = j;
for ( ; n < j*w+w; n++ ) {
if ( fabs(A[n]) > biga_absval ) {
biga_absval = fabs( A[n] );
ir = j;
ic = n - j*w;
}
n++;
}
}
std::cout << ir <<ic;
return 0;
}
Please note: examples were created to look at output only (and indexes etc. don't neccessarily make sense):
So my question is: is this just a (known?) optimizer bug (?) or is there some logic behind what in this case seems like a clear optimization miss ?
Using latest stable vs2017 15.9.5
Update: The extra movs I see is before the jump codes - easiest way to find in compiler explorer is to right click on the if
and then "scroll to".