Alisa's solution may not be optimal for some people, so here is an alternative.
Rosetta 2 is basically an emulator (strictly speaking, a binary translator); it takes compiled x86-64 machine code and runs it on Arm. I don't have an M1 CPU, but by all accounts it does a very good job of this.
That said, it can often be better to compile code directly for the Arm CPU instead of relying on Rosetta. The compiler generally has more information about how the code works than an emulator, which has to operate after all that additional context has been thrown away, so the compiler can sometimes optimize the code more effectively.
The problem Alisa is running into is that SSE intrinsics aren't designed to be portable; they're designed to let people achieve better performance by writing code which is very tightly coupled to the underlying architecture.
There are a couple of projects which allow you to compile your SSE code using NEON, which you can think of as Arm's version of SSE, by providing alternate implementations of the SSE API. The two most popular are probably SSE2NEON and SIMD Everywhere (SIMDe) (full disclosure: I am the lead developer of the latter).
SSE2NEON simply implements SSE using NEON. SIMDe provides many implementations, including NEON, AltiVec/VSX (POWER), WebAssembly SIMD, z/Architecture, etc., as well as portable fallbacks which work everywhere.
Both projects work basically the same way: instead of including <xmmintrin.h> (or some other x86-specific header, depending on which ISA extension you want to use) you include either SSE2NEON or SIMDe. You then add any relevant compiler flags to set the target (e.g., -march=armv8-a+simd), and you're good to go.
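To make the idea more concrete, here is a toy sketch of what such a compatibility header does internally. This is not code from SSE2NEON or SIMDe; it covers only three SSE functions, and add_arrays is a hypothetical helper made up for illustration. The real projects cover far more of the API and take care to match x86 semantics in the corner cases.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(__x86_64__) || defined(_M_X64)
  /* x86: use the real SSE API. */
  #include <xmmintrin.h>
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
  /* Arm: map the few SSE names used below onto NEON equivalents. */
  #include <arm_neon.h>
  typedef float32x4_t __m128;
  static inline __m128 _mm_loadu_ps(const float *p) { return vld1q_f32(p); }
  static inline __m128 _mm_add_ps(__m128 a, __m128 b) { return vaddq_f32(a, b); }
  static inline void _mm_storeu_ps(float *p, __m128 v) { vst1q_f32(p, v); }
#else
  /* Anything else: plain C fallback, one lane at a time. */
  typedef struct { float lane[4]; } __m128;
  static inline __m128 _mm_loadu_ps(const float *p) {
      __m128 r;
      for (int i = 0; i < 4; i++) r.lane[i] = p[i];
      return r;
  }
  static inline __m128 _mm_add_ps(__m128 a, __m128 b) {
      __m128 r;
      for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] + b.lane[i];
      return r;
  }
  static inline void _mm_storeu_ps(float *p, __m128 v) {
      for (int i = 0; i < 4; i++) p[i] = v.lane[i];
  }
#endif

/* Hypothetical "existing SSE code": identical source on every target. */
void add_arrays(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
    /* Process four floats per iteration with the SSE-style API. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    /* Handle any leftover elements with scalar code. */
    for (; i < n; i++) out[i] = a[i] + b[i];
}
```

The body of add_arrays never changes; only the header section decides whether the _mm_* calls become SSE, NEON, or scalar instructions. With the real projects you get that header section for free by swapping the include.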
If performance isn't a major concern, Rosetta 2 is probably the easiest option. Otherwise you may want to look into SSE2NEON or SIMDe.
Another consideration is whether you just want a quick fix or eventually want to port the code over to Arm. Rosetta 2 is not intended to be a long-term solution, but rather a stop-gap to allow existing code to continue working while people port it. SSE2NEON and SIMDe both make it possible to mix x86 and Arm SIMD code in the same executable, so you can port your code gradually over time instead of having to flip one big switch to transition from x86 to Arm.