How to compile a project which requires SSE2 on MacBook with M1 chip?

Question

I need to install software that requires SSE2 on my MacBook Air with an M1 chip (macOS Monterey).

When I try to compile the project, I get the following error:

/libRootFftwWrapper/vectorclass/vectorclass.h:38:4: error: Please compile for the SSE2 instruction set or higher
  #error Please compile for the SSE2 instruction set or higher
   ^

and the error message links to the following lines in the code:

#include "instrset.h"        // Select supported instruction set

#if INSTRSET < 2             // SSE2 required
  #error Please compile for the SSE2 instruction set or higher
#else

I understand that only x86 chips (Intel, AMD) support SSE2, but is there any kind of translator that can help me build this project?

Update: the problem is solved. The solution is in the answer section.

If the project doesn’t support the arm64 architecture that the M1 chip uses, then you generally have to put in a bunch of work to make it compatible. But you might be able to force the project to compile for the x86_64 architecture instead—the Rosetta 2 system in macOS can run x86_64 binaries with few or no problems. — bdesham, Feb 19 '22 at 00:59
If most of the SIMD usage is with Agner Fog's `vectorclass.h`, it *might* be possible to substitute an ARM vector library without a huge amount of work, just some search/replace or even using the same names like `Vec4f` and overloads. Depending on how much of it is pure vertical SIMD, without a lot of shuffles that might be harder to port from SSE2 to NEON. — Peter Cordes, Feb 19 '22 at 01:33
@bdesham, I managed to compile the project with Rosetta 2. Thank you for the suggestion. — Alisa Nozdrina, Feb 19 '22 at 22:42

score 5 · Answer 1 · answered Feb 21 '22 at 22:21

Alisa's solution may not be optimal for some people, so here is an alternative.

Rosetta 2 is basically an emulator: it translates compiled x86_64 machine code so that it can run on Arm. I don't have an M1 CPU, but by all accounts it does a very good job of this.
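If it is unclear whether a given build actually ends up running under Rosetta 2, a process can ask macOS directly. The following is a minimal sketch that queries the sysctl.proc_translated key documented by Apple. The function name rosetta_status is hypothetical, and on non-Apple platforms the function simply reports "unknown".

```cpp
#include <cerrno>
#include <cstddef>
#if defined(__APPLE__)
  #include <sys/types.h>
  #include <sys/sysctl.h>
#endif

// Hypothetical helper, not part of any library.
// Returns 1 if this process is being translated by Rosetta 2,
// 0 if it runs natively, and -1 if that cannot be determined
// (which is always the case on non-Apple platforms).
int rosetta_status() {
#if defined(__APPLE__)
    int translated = 0;
    std::size_t size = sizeof(translated);
    if (sysctlbyname("sysctl.proc_translated", &translated, &size, nullptr, 0) == -1) {
        // ENOENT means the key does not exist (older macOS), so the
        // process cannot be translated; other errors are "unknown".
        return errno == ENOENT ? 0 : -1;
    }
    return translated;
#else
    return -1;
#endif
}
```

A binary built with arch -x86_64 and run on an M1 should report 1 here, while a native arm64 build should report 0.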

That said, it can often be better to compile code directly to target the Arm CPU instead of relying on Rosetta. The compiler generally has more information about how the code works than an emulator which has to operate after all that additional context has been thrown away, so it can sometimes optimize code more effectively.

The problem Alisa is running into is that SSE intrinsics aren't designed to be portable; they're designed to let people achieve better performance by writing code that is very tightly coupled to the underlying architecture.
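To see that coupling in practice, it can help to look at which predefined macros the compiler sets for the current target. This is only an illustrative sketch and not the logic that instrset.h actually uses; the function name simd_target is hypothetical.

```cpp
// Hypothetical helper: reports which SIMD family the compiler is targeting,
// based on predefined macros from GCC/Clang (__SSE2__, __aarch64__,
// __ARM_NEON) and MSVC (_M_X64, _M_IX86_FP).
const char* simd_target() {
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return "x86 with SSE2";
#elif defined(__aarch64__) || defined(__ARM_NEON)
    return "Arm with NEON";
#else
    return "no known SIMD target";
#endif
}
```

A native build on an M1 takes the Arm branch because the compiler does not define __SSE2__ there, which is why an SSE2 check like the one in vectorclass.h ends in #error. Compiling the same file for x86_64 takes the first branch.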

There are a couple of projects which allow you to compile your SSE code using NEON, which you can think of as Arm's version of SSE, by providing alternate implementations of the SSE API. The two most popular are probably SSE2NEON and SIMD Everywhere (SIMDe) (full disclosure: I am the lead developer for the latter).

SSE2NEON simply implements SSE using NEON. SIMDe provides many implementations, including NEON, AltiVec/VSX (POWER), WebAssembly SIMD, z/Architecture, etc., as well as portable fallbacks which work everywhere.

Both projects work basically the same way: instead of including <xmmintrin.h> (or some other x86-specific header, depending on which ISA extension you want to use), you include either SSE2NEON or SIMDe. You then add any relevant compiler flags to set the target (e.g., -march=armv8-a+simd), and you're good to go.
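As a rough sketch of that header swap, the code below picks the native SSE2 header on x86 and falls back to sse2neon.h when that file is on the include path. The macro EXAMPLE_HAVE_SSE_API and the function add_floats are hypothetical names made up for this example, and the scalar branch is only there so the snippet still builds when neither header is available.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE2__)
  #include <emmintrin.h>            // native SSE/SSE2 intrinsics on x86
  #define EXAMPLE_HAVE_SSE_API 1
#elif defined(__has_include)
  #if __has_include("sse2neon.h")
    #include "sse2neon.h"           // SSE API implemented on top of NEON
    #define EXAMPLE_HAVE_SSE_API 1
  #endif
#endif
#ifndef EXAMPLE_HAVE_SSE_API
  #define EXAMPLE_HAVE_SSE_API 0    // neither header found: use scalar code
#endif

// Adds two float arrays element-wise. n must be a multiple of 4.
void add_floats(const float* a, const float* b, float* out, std::size_t n) {
    assert(n % 4 == 0);
#if EXAMPLE_HAVE_SSE_API
    // The same SSE intrinsics compile on x86 and, via sse2neon.h, on Arm.
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#else
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

The body of add_floats does not change between targets; only the include at the top does. SIMDe follows the same idea with its own headers and naming.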

If performance isn't a major concern, Rosetta 2 is probably the easiest option. Otherwise you may want to look into SSE2NEON or SIMDe.

Another consideration is whether you just want a quick fix or eventually want to port the code over to Arm. Rosetta 2 is not intended to be a long-term solution, but rather a stop-gap to allow existing code to continue working while people port their code. SSE2NEON and SIMDe both make it possible to mix x86 and Arm SIMD code in the same executable, so you can port your code gradually over time instead of having to flip one big switch to transition from x86 to Arm.
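A gradual port might look roughly like this: one routine has already been given a hand-written NEON path, while the original SSE path stays in place for x86 builds. The function scale_floats is a hypothetical example. With SSE2NEON or SIMDe, the SSE branch could also serve Arm builds until a native version exists; this sketch leaves that out so it has no external dependency.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__ARM_NEON)
  #include <arm_neon.h>
#elif defined(__SSE2__)
  #include <emmintrin.h>
#endif

// Multiplies every element of data by factor. n must be a multiple of 4.
void scale_floats(float* data, std::size_t n, float factor) {
    assert(n % 4 == 0);
#if defined(__ARM_NEON)
    // Already ported: native NEON intrinsics.
    for (std::size_t i = 0; i < n; i += 4) {
        float32x4_t v = vld1q_f32(data + i);
        vst1q_f32(data + i, vmulq_n_f32(v, factor));
    }
#elif defined(__SSE2__)
    // Original x86 path, left unchanged.
    const __m128 f = _mm_set1_ps(factor);
    for (std::size_t i = 0; i < n; i += 4) {
        _mm_storeu_ps(data + i, _mm_mul_ps(_mm_loadu_ps(data + i), f));
    }
#else
    // Portable fallback for any other target.
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= factor;
    }
#endif
}
```

Routines can be moved to the NEON branch one at a time, and both builds keep working in between.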

score 2 · Accepted Answer · answered Feb 19 '22 at 22:57

I managed to compile the project using Rosetta 2, as suggested in the comments. To install Rosetta, I used the following command:

$ softwareupdate --install-rosetta

Then I installed Homebrew, Clang (via the llvm package), and CMake for the x86_64 architecture:

$ arch -x86_64 /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
$ arch -x86_64 /usr/local/bin/brew install llvm
$ arch -x86_64 /usr/local/bin/brew install cmake

I also had to re-tap Homebrew:

$ rm -rf "/usr/local/Homebrew/Library/Taps/homebrew/homebrew-core"
$ arch -x86_64 /usr/local/bin/brew tap homebrew/core

as suggested by brew doctor.

Finally, the project compiled after I removed the previously generated CMake cache:

$ make clean
$ arch -x86_64 /usr/local/bin/cmake build_dir 
$ make 
$ make install
