I'm going to toss in an answer for those coming from GCC. In the GCC world, we do -march=native
and GCC defines macros like -D__SSE2__
, -D__SSE4_1__
, -D__SSE4_2__
, -D__AES__
, -D__AVX__
, -D__BMI__
, etc.
SunCC does not do like GCC does. It does not provide defines like __SSE2__
; nor does it provide the value for -xarch
.
Here are the references to the relevant Sun Studio manuals and the -xarch
options/instructions set choices:
Here's how we are determining what flags we can use, and then converting them to GCC preprocessor macros. Its awful, but I don't know how to get the code generated otherwise.
CC=...
EGREP=...
X86_CPU_FLAGS=$(isainfo -v 2>/dev/null)
SUNCC_510_OR_ABOVE=$("$CXX" -V 2>&1 | "$EGREP" -c "CC: (Sun|Studio) .* (5\.1[0-9]|5\.[2-9]|[6-9]\.)")
SUNCC_511_OR_ABOVE=$("$CXX" -V 2>&1 | "$EGREP" -c "CC: (Sun|Studio) .* (5\.1[1-9]|5\.[2-9]|[6-9]\.)")
SUNCC_512_OR_ABOVE=$("$CXX" -V 2>&1 | "$EGREP" -c "CC: (Sun|Studio) .* (5\.1[2-9]|5\.[2-9]|[6-9]\.)")
SUNCC_513_OR_ABOVE=$("$CXX" -V 2>&1 | "$EGREP" -c "CC: (Sun|Studio) .* (5\.1[3-9]|5\.[2-9]|[6-9]\.)")
SUNCC_XARCH=
if [[ ("$SUNCC_511_OR_ABOVE" -ne "0") ]]; then
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "sse2") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__SSE2__"); SUNCC_XARCH=sse2; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "sse3") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__SSE3__"); SUNCC_XARCH=ssse3; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "ssse3") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__SSSE3__"); SUNCC_XARCH=ssse3; fi
if [[ ("$SUNCC_512_OR_ABOVE" -ne "0") ]]; then
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "sse4.1") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__SSE4_1__"); SUNCC_XARCH=ssse4_1; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "sse4.2") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__SSE4_2__"); SUNCC_XARCH=ssse4_2; fi
if [[ ("$SUNCC_513_OR_ABOVE" -ne "0") ]]; then
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "aes") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__AES__"); SUNCC_XARCH=aes; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "pclmulqdq") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__PCLMUL__"); SUNCC_XARCH=aes; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "rdrand") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__RDRND__"); SUNCC_XARCH=avx_i; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "rdseed") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__RDSEED__"); SUNCC_XARCH=avx_i; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "avx") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__AVX__"); SUNCC_XARCH=avx; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "avx2") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__AVX2__"); SUNCC_XARCH=avx2; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "bmi") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__BMI__"); SUNCC_XARCH=avx2; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "bmi2") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__BMI2__"); SUNCC_XARCH=avx2; fi
if [[ ($(echo -n "$X86_CPU_FLAGS" | "$GREP" -c "adx") -ne "0") ]]; then PLATFORM_CXXFLAGS+=("-D__ADX__"); SUNCC_XARCH=avx2_i; fi
fi
fi
fi
fi
PLATFORM_CXXFLAGS+=("-xarch=$SUNCC_XARCH")
The gyrations above allow us to do things like this (except we need SSE2 though ADX).
#if (_MSC_VER >= 1700) || defined(__RDRND__)
uint64_t val;
if(_rdrand64_step(&val))
{
// Use RDRAND value
}
#endif
Without the gyrations, we continually crash the 12.1 through 12.3 compilers during testing with the inline assembly and intrinsics.
The result of running the script gives us the recipe for CFLAGS
and CXXFLAGS
. Below is from a 4th gen Core i5. XEON's produce different results, as does a 5th gen Core i5. For example, a 5th gen Core i5 will have ADX
and use -xarch=avx_i
.
Pathname: /opt/solstudio12.2/bin/CC (symlinked)
CXXFLAGS: -D__SSE2__ -D__SSE3__ -D__SSSE3__ -xarch=ssse3
/opt/solarisstudio12.3/bin/CC (symlinked)
CXXFLAGS: -D__SSE2__ -D__SSE3__ -D__SSSE3__ -D__SSE4_1__ -D__SSE4_2__ -xarch=ssse4_2
Pathname: /opt/solarisstudio12.4/bin/CC
CXXFLAGS: -D__SSE2__ -D__SSE3__ -D__SSSE3__ -D__SSE4_1__ -D__SSE4_2__ -D__AES__ -D__PCLMUL__ -D__RDRND__ -D__AVX__ -xarch=avx
...