Questions tagged [intrinsics]

Intrinsics are functions used in compiled languages to trigger the execution of specific processor instructions, typically those outside the scope of the compiled language itself.

Intrinsic functions are pseudo-functions used by compilers to represent functionality that is outside the current scope of the language; over time, they are often incorporated into the language itself. Examples include SIMD and atomic instructions. The compiler knows what each intrinsic does and can optimize register use to take advantage of it.
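As a small illustration of a SIMD intrinsic, the sketch below adds four floats at once using the SSE intrinsics from `<xmmintrin.h>`. The function name `add4` is hypothetical and chosen only for this example; the scalar branch is there so the file still compiles when the compiler does not define `__SSE__`.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* Hypothetical helper: adds four floats from a and b element-wise
   and writes the four sums to out. */
void add4(const float *a, const float *b, float *out)
{
#if defined(__SSE__)
    /* Unaligned loads, so a and b need no special alignment. */
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    /* One intrinsic call performs all four additions; the compiler
       chooses which XMM registers hold va, vb and the result. */
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
#else
    /* Plain C path for targets without SSE. */
    for (size_t i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

Calling `add4(a, b, out)` with three four-element `float` arrays fills `out` with the element-wise sums; no register names appear anywhere in the source.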

The compiler's support library usually provides real implementations of these functions, which are used when a less capable (or entirely different) CPU is detected at run time or targeted at compile time.
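The fallback idea can also be written by hand. The sketch below is one possible shape, assuming GCC or Clang on x86: it checks for the POPCNT instruction at run time with `__builtin_cpu_supports` and otherwise uses a portable loop. The names `popcount_portable`, `popcount_hw` and `popcount32` are hypothetical, and this is not a description of how any particular compiler library dispatches internally.

```c
#include <stdint.h>

/* Portable fallback: clears the lowest set bit until none remain. */
unsigned popcount_portable(uint32_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        ++n;
    }
    return n;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>

/* The target attribute lets this one function use POPCNT even when the
   rest of the file is compiled without -mpopcnt (assumes a reasonably
   recent GCC or Clang). */
__attribute__((target("popcnt")))
static unsigned popcount_hw(uint32_t x)
{
    return (unsigned)_mm_popcnt_u32(x);
}
#endif

/* Picks the hardware path only if the running CPU reports POPCNT. */
unsigned popcount32(uint32_t x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    if (__builtin_cpu_supports("popcnt"))
        return popcount_hw(x);
#endif
    return popcount_portable(x);
}
```

Both paths return the same value for every input, so callers only ever see `popcount32`; the choice of instruction stays an implementation detail.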

Compiler intrinsics are very similar to inline assembly. Inline assembly requires annotations that specify the permissible input and output registers as well as the clobbered values, unless the compiler parses the inline assembly itself. With a compiler intrinsic, the register use is already built into the compiler, so a developer doesn't need to know as many low-level details, although some low-level assembly knowledge is often helpful to guide profiling and optimization.
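To make the comparison concrete, here is a hedged sketch of the same four-float addition written both ways. It assumes GCC or Clang on x86-64 with the default AT&T assembler syntax; the function names `add4_intrinsic` and `add4_asm` are hypothetical.

```c
#if defined(__GNUC__) && defined(__x86_64__)
#include <xmmintrin.h>

/* Intrinsic version: no register constraints to write; the compiler
   allocates registers and is free to schedule or combine the add. */
void add4_intrinsic(const float *a, const float *b, float *out)
{
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}

/* Inline-assembly version: the operand constraints are spelled out.
   "+x" marks va as read and written in an SSE register, "x" marks vb
   as an input in an SSE register. ADDPS does not modify the flags and
   this statement touches no memory, so the clobber list is empty. */
void add4_asm(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    __asm__("addps %1, %0" : "+x"(va) : "x"(vb));
    _mm_storeu_ps(out, va);
}
#endif
```

The two functions compute the same result, but the compiler treats the `__asm__` statement as opaque, whereas it understands `_mm_add_ps` and can optimize around it.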

1314 questions
-1
votes
1 answer

C++ SSE Intrinsics: Storing results in variables

I have trouble understanding the usage of SSE intrinsics to store results of some SIMD calculation back into "normal variables". For example the _mm_store_ps intrinsic is described in the "Intel Intrinsics Guide" as follows: void _mm_store_ps…
n1198943
  • 13
  • 2
-1
votes
1 answer

How to use vindex and scale with _mm_i32gather_epi32 to gather elements?

Intel's Intrinsic Guide says: __m128i _mm_i32gather_epi32 (int const* base_addr, __m128i vindex, const int scale) And: Description Gather 32-bit integers from memory using 32-bit indices. 32-bit elements are loaded from addresses starting at…
jww
  • 97,681
  • 90
  • 411
  • 885
-1
votes
1 answer

bitwise operator from different type with opencv

I frequently encounter this issue but don't really know a proper way to fix it. I would just like some advice on how to do it with regard to processing time. I am using OpenCV and I want to perform this operation: map |= mask & mu(0); map is a…
John_Sharp1318
  • 939
  • 8
  • 19
-1
votes
1 answer

matrix optimization - segmentation fault when using intrinsics and loop unrolling

I'm currently trying to optimize matrix operations with intrinsics and loop unrolling. There was a segmentation fault which I couldn't figure out. Here is the code I changed: const int UNROLL = 4; void outer_product(matrix *vec1, matrix *vec2,…
-1
votes
1 answer

_mm256_cvtss_f32 isn't recognized by XCode

I'm trying to use SIMD intrinsics with a C program in XCode 7.1. (Note, I am writing a C99 program and not a C++ program). I've included immintrin.h, and I've written several functions using intrinsic commands that function very well. I'm now…
user24205
  • 481
  • 5
  • 15
-1
votes
1 answer

cpu requirements for `x86intrin.h`?

Hi there, I thought the minimum CPU requirement to use x86intrin.h was an Intel 3rd gen processor. However, when I run this code _rdseed64_step(&temp2); I get the following error: error: inlining failed in call to always_inline 'int…
albusSimba
  • 441
  • 4
  • 14
-1
votes
3 answers

How to measure the elapsed time below nanosecond for x86?

I have searched and used many approaches for measuring the elapsed time. There are many questions for this purpose. For example, this question is very good, but when you need an accurate time recorder I couldn't find a good method. For this, I want…
Amiri
  • 2,417
  • 1
  • 15
  • 42
-1
votes
1 answer

unknown segmentation fault issue

I have a segmentation fault problem that is driving me crazy. This is the code: for (k = 0; k < range; k=k+4) { int k1,k2,kfactor,k1factor,k2factor; __m128 bfly0_rv, bfly1_rv, bfly2_rv, bfly3_rv; …
A.nechi
  • 521
  • 1
  • 5
  • 15
-1
votes
1 answer

Getting an understandable error using __m512 intel intrinsic

Hi, I'm trying to use Intel intrinsics, so I've made some macros that contain the intrinsics like this: #define __M512_MM_SET_PS(dest, e15, e14, e13, e12, e11, e10, e9, e8, e7, e6, e5, e4, e3, e2, e1, e0)\ { …
A.nechi
  • 521
  • 1
  • 5
  • 15
-1
votes
2 answers

Is the flag -ffixed- always bugged in GCC?

I have 3 versions of GCC installed on my 64-bit Linux machine: gcc 4.9.2, gcc 5.3.0, and gcc 6 [a build from an svn snapshot]. All 3 compilers give me the same error when I try to explicitly reserve xmm registers with -ffixed-xmm0 -ffixed-xmm1…
xelp
  • 47
  • 4
-1
votes
1 answer

_mm_storeu_si128 cost too much time?

This is a C function, which gets weight values of src and stores them into dst. static int _medium_c( DCTELEM * src, int index, int *dst ) { int i; //get weighted value for( i = 0; i < 16; i++ ) { unsigned int threshold1 =…
Alex LEE
  • 1
  • 1
-1
votes
1 answer

Intel Intrinsics: combine every other word from 2 registers

I have two __m128i registers, let's call them srcA and srcB. From that I want to get an __m128i register, let's say dst, which contains the following words (pseudo-code assuming srcA, srcB and dst are word pointers): dst[0] = srcA[0]; dst[1] =…
Warpin
  • 6,971
  • 12
  • 51
  • 77
-1
votes
2 answers

C++: When should I use _disable() _enable()

Visual Studio allows you to clear or set the processor's interrupt flag via _disable or _enable (see link). When is it recommended to use such tools, especially with regard to performance? https://msdn.microsoft.com/en-us/library/tzkfha43.aspx
user1235183
  • 3,002
  • 1
  • 27
  • 66
-1
votes
2 answers

SSE intrinsics: How to store values to the register?

I'm very new to SSE intrinsics and have a small problem. I need help loading integer values into an __m128i. Here is what I already have: __m128i a = _mm_set_epi16( 1, 1, 2, 2, 3, 3, 4, 4 ); __m128i b = _mm_set_epi16( 5, 5, 6, 6, 7, 7, 8, 8…
user1235183
  • 3,002
  • 1
  • 27
  • 66
-1
votes
1 answer

Where can I find a reference or a book to understand the terms used in Intel intrinsics descriptions?

I'm studying to get a master's degree in CS and want (and need) to learn to use Intel intrinsics. However, the new intrinsics reference page, while being awesome per se, is full of specific lingo, which, as far as I understand, is related to…
iksemyonov
  • 4,106
  • 1
  • 22
  • 42