Questions tagged [intrinsics]

Intrinsics are functions used in compiled languages to trigger the execution of specific processor instructions, typically those outside the scope of the compiled language itself.

Intrinsic functions are pseudo-functions that compilers provide to expose functionality outside the current scope of the language; often, that functionality may later be incorporated into the language itself. Examples include SIMD and atomic instructions. The compiler knows what each intrinsic does and can optimize register use around it.

A compiler's runtime library usually contains actual implementations of these functions, which are used when a lower-class (or completely different) CPU is detected at run time or targeted at compile time.
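As a minimal sketch of the fallback behaviour described above, assuming GCC or Clang: `__builtin_popcountll` is a real builtin of those compilers, while the wrapper names `popcount_intrinsic` and `popcount_portable` are hypothetical and exist only for illustration. When the target has a population-count instruction enabled (for example with `-mpopcnt` on x86), the compiler can typically emit it directly; otherwise it usually falls back to a generic sequence or a helper routine from its runtime library.

```c
#include <stdint.h>

/* Intrinsic version: the compiler decides how to lower this call.
   It may become a single instruction or a call into the compiler's
   runtime library, depending on the target selected at compile time. */
static inline int popcount_intrinsic(uint64_t x) {
    return __builtin_popcountll(x);
}

/* Hypothetical portable reference, similar in spirit to what a
   library fallback might do: clear the lowest set bit until none remain. */
static inline int popcount_portable(uint64_t x) {
    int n = 0;
    while (x != 0) {
        x &= x - 1;   /* drops the lowest set bit */
        n++;
    }
    return n;
}
```

Both functions return the same result for every input; only the generated machine code differs.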

Compiler intrinsics are very similar to inline assembly. Unless the compiler parses the inline assembly itself, inline assembler needs notations that declare the permissible input and output registers as well as clobbered values. With a compiler intrinsic, register use is already built into the compiler, so a developer doesn't need to know as many low-level details, although some low-level assembler knowledge is often helpful to guide profiling and optimization.
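The contrast can be sketched as follows, assuming GCC or Clang on x86-64 for the inline-assembly half. `__builtin_bswap64` and the extended `__asm__` syntax are real features of those compilers; the function names `bswap_asm` and `bswap_intrinsic` are hypothetical. On other architectures the assembly path is compiled out so the sketch still builds.

```c
#include <stdint.h>

/* Inline assembly: the programmer states the operand constraint.
   "+r" means x is both read and written and may live in any
   general-purpose register. */
static inline uint64_t bswap_asm(uint64_t x) {
#if defined(__x86_64__)
    __asm__("bswap %0" : "+r"(x));
    return x;
#else
    /* Assumption: no hand-written assembly for this target, so reuse the builtin. */
    return __builtin_bswap64(x);
#endif
}

/* Intrinsic: no constraints or clobbers to write; the compiler
   already knows the register requirements and can optimize around the call. */
static inline uint64_t bswap_intrinsic(uint64_t x) {
    return __builtin_bswap64(x);
}
```

A call such as `bswap_intrinsic(0x0102030405060708ULL)` reverses the byte order of its argument, and the compiler remains free to fold it at compile time, which it generally cannot do for the opaque assembly version.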

Related tags: simd atomic inline-assembly

1314 questions

votes

1 answer

When the compiler reorders AVX instructions on Sandy, does it affect performance?

Please do not say this is premature microoptimization. I want to understand, as much as it is possible given my limited knowledge, how the described SB feature and assembly works, and make sure that my code makes use of this architectural feature.…

c performance optimization intrinsics avx

asked Jan 04 '15 at 20:10

iksemyonov

4,106
1
22
42

votes

1 answer

Funnel shift - what is it?

When reading through the CUDA 5.0 Programming Guide I stumbled on a feature called "Funnel shift" which is present on compute capability 3.5 devices, but not 3.0. It contains an annotation "see reference manual", but when I search for the "funnel shift"…

cuda intrinsics ptx

asked Oct 07 '12 at 08:00

CygnusX1

20,968
5
65
109

votes

3 answers

What is the difference between Java intrinsic and native methods?

Java intrinsic functions are mentioned in various places (e.g. here). My understanding is that these are methods that are handled with special native code. This seems similar to a JNI method, which is also a block of native code. What is the difference?

java native intrinsics

asked Jun 21 '19 at 07:28

rghome

8,529
8
43
62

votes

3 answers

How to use the multiply and accumulate intrinsics in ARM Cortex-a8?

How do I use the multiply-accumulate intrinsics provided by GCC? float32x4_t vmlaq_f32(float32x4_t, float32x4_t, float32x4_t); Can anyone explain which three parameters I have to pass to this function? I mean the source and destination registers…

c arm simd intrinsics neon

asked Jul 13 '10 at 18:56

HaggarTheHorrible

7,083
20
70
81

votes

4 answers

What's the proper way to use different versions of SSE intrinsics in GCC?

I will ask my question by giving an example. Now I have a function called do_something(). It has three versions: do_something(), do_something_sse3(), and do_something_sse4(). When my program runs, it will detect the CPU feature (see if it supports…

c gcc sse intrinsics

asked Mar 23 '13 at 08:54

shengbinmeng

1,517
2
12
22

votes

2 answers

Scatter intrinsics in AVX

I can't find them in the Intel Intrinsic Guide v2.7. Do you know if AVX or AVX2 instruction sets support them?

intrinsics avx avx2

asked Dec 24 '12 at 11:16

elmattic

12,046
5
43
79

votes

2 answers

Constexpr and SSE intrinsics

Most C++ compilers support SIMD (SSE/AVX) instructions with intrinsics like _mm_cmpeq_epi32. My problem with this is that this function is not marked as constexpr, although "semantically" there is no reason for this function to not be constexpr since…

c++ sse simd constexpr intrinsics

asked Aug 16 '18 at 14:59

NoSenseEtAl

28,205
28
128
277

votes

1 answer

Intel Intrinsics guide - Latency and Throughput

Can somebody explain the Latency and the Throughput values given in the Intel Intrinsic Guide? Have I understood it correctly that the latency is the amount of time units an instruction takes to run, and the throughput is the number of instructions…

performance x86 intel sse intrinsics

asked Oct 23 '16 at 13:05

Philipp Neufeld

1,053
10
23

votes

2 answers

How do you use the pause assembly instruction in 64-bit C++ code?

Since inlined assembly is not supported by VC++ 2010 in 64-bit code, how do I get a pause x86-64 instruction into my code? There does not appear to be an intrinsic for this like there is for many other common assembly instructions (e.g., __rdtsc(),…

c++ visual-c++ x86 intrinsics visual-c++-2010

asked Apr 29 '11 at 14:42

Michael Goldshteyn

71,784
24
131
181

votes

1 answer

Why are there 128bit load functions for SSE?

I'm poking around in somebody else's code and currently trying to figure out why _mm_load_si128 exists. Essentially, I tried replacing _ra = _mm_load_si128(reinterpret_cast<__m128i*>(&cd->data[idx])); with _ra =…

c++ x86 sse simd intrinsics

asked May 27 '17 at 13:10

user81993

6,167
6
32
64

votes

3 answers

Produce loops without cmp instruction in GCC

I have a number of tight loops I'm trying to optimize with GCC and intrinsics. Consider for example the following function. void triad(float *x, float *y, float *z, const int n) { float k = 3.14159f; int i; __m256 k4 =…

c gcc optimization assembly intrinsics

asked Sep 18 '14 at 20:17

Z boson

32,619
11
123
226

votes

4 answers

Arm Neon Intrinsics vs hand assembly

https://web.archive.org/web/20170227190422/http://hilbert-space.de/?p=22 This site, which is quite dated, shows that hand-written asm would give a much greater improvement than the intrinsics. I am wondering if this is the current truth even now…

arm neon intrinsics

asked Mar 22 '12 at 18:48

George Host

votes

3 answers

SSE instruction set not enabled

I am having trouble with this error: "SSE instruction set not enabled". How can I figure this out? I have an Acer i7 with Ubuntu 11.10. Can anyone please help me? Any help will be appreciated! Also running: sudo cat /proc/cpuinfo | grep…

c++ intrinsics sse2 sse3

asked Feb 04 '12 at 21:06

ksolid

votes

5 answers

128-bit division intrinsic in Visual C++

I'm wondering if there really is no 128-bit division intrinsic function in Visual C++? There is a 64x64=128 bit multiplication intrinsic function called _umul128(), which nicely matches the MUL x64 assembler instruction. Naturally, I assumed there…

visual-c++ intrinsics integer-division 128-bit

asked Dec 09 '11 at 23:50

cxxl

4,939
3
31
52

votes

1 answer

gcc, simd intrinsics and fast-math concepts

Hi all :) I'm trying to get the hang of a few concepts regarding floating point, SIMD/math intrinsics and the fast-math flag for gcc. More specifically, I'm using MinGW with gcc v4.5.0 on an x86 CPU. I've searched around for a while now, and that's…

gcc simd intrinsics fast-math

asked Feb 11 '11 at 07:13

rocket441
