Questions tagged [xeon-phi]

a co-processor/accelerator from Intel

Intel Many Integrated Core Architecture or Intel MIC (pronounced Mike) is a multiprocessor computer architecture developed by Intel incorporating earlier work on the Larrabee many core architecture, the Teraflops Research Chip multicore chip research project, and the Intel Single-chip Cloud Computer multicore microprocessor.

188 questions
5
votes
1 answer

Get specific model of a xeon phi

I'm trying to find the exact model of a Xeon Phi coprocessor I'm using. I ran micinfo and this is what I get: ***************************/opt/intel/mic/bin/micinfo*************************** MicInfo Utility Log Created Fri Jan 10 13:09:40…
user1730250
  • 582
  • 2
  • 9
  • 26
4
votes
1 answer

What is _kmp_fork_barrier and how to see if there is load imbalance?

I'm using Intel VTune Amplifier to see how my parallel application scales. Note that I don't use any explicit locking mechanism. It scales pretty well on my 4-core laptop (considering that there are portions of the algorithm that can't be…
4
votes
1 answer

invalid 'asm': nested assembly dialect alternatives

I'm trying to write some inline assembly code with KNC instructions for the Xeon Phi platform, using the k1om-mpss-linux-gcc compiler. I want to use a mask register in my code in order to vectorize my computation. Here is my code: #include…
Hamid_UMB
  • 317
  • 4
  • 16
4
votes
1 answer

How to pass struct to offload in Xeon Phi

I have a struct A with a few int and one int * member. How can I use this in offload? I probably can't do #pragma offload target(mic: 0) inout(A){}..., but what about #pragma offload target(mic: 0) in(A->firstInt, A->secondInt)…
jabk
  • 1,388
  • 4
  • 25
  • 43
4
votes
1 answer

OpenMP 4.0 - GCC 5.2.0 - Overlap device and host task execution

I am trying to test a very simple program that uses the GCC 5 offload capabilities through OpenMP 4.0 directives. My goal is to write a program with two independent tasks, one task being executed on an accelerator (i.e. Intel MIC emulator) and another…
4
votes
0 answers

How to monitor the utilization of cores on Xeon Phi at 10Hz?

I've been trying to measure/monitor the utilization of all 60 cores on a Xeon Phi (Knights Corner, in-order processors) at a relatively high frequency, say, at least every 0.1 s, i.e. 10 Hz. I tried the latest PAPI library. But it only…
thierry
  • 217
  • 2
  • 12
4
votes
1 answer

Vectorizing/optimising loop with unaligned data access for wide registers (Xeon Phi in particular)

This is my first time asking a question in the Stack Overflow community. Sorry if my question does not fit the forum's style/size; I will improve with experience. I am trying to vectorize a loop in C++ using Intel Compiler 14.0.1 to make better…
4
votes
3 answers

Offload daemon on xeon phi 5110p

I am aware that the Intel Xeon Phi coprocessor SE10X has 61 cores and that it is suggested to use only 60 of them, since 1 core is used for the offload daemon. Since the Intel Xeon Phi coprocessor 5110P has 60 cores, is it suggested to use 59 cores?
hrs
  • 487
  • 5
  • 18
3
votes
1 answer

Differences between current gen Xeon Processors

What are the actual differences between the Xeon W series and the Bronze, Silver, Gold and Platinum series? With earlier versions of Xeons, the E3s were single-socket CPUs, whereas E5s could be used in motherboards with two sockets. The E7s were quad sockets…
kris
  • 153
  • 1
  • 7
3
votes
0 answers

Intel MKL/Xeon Phi Offload Runtime Issue - Auto Offload not working

I have set up my Xeon Phi 3120A in Windows 10 Pro, with MPSS 3.8.4 and Parallel XE 2017 (Initial Release). I chose this Parallel XE because it was the last XE to support the x100 series. I have installed the MKL version that is packaged with…
RashKel
  • 71
  • 6
3
votes
1 answer

Required time to offload a function to Intel Xeon Phi

Is there a predefined time that an offload call requires to transfer the data (parameters) of a function from the host to an Intel MIC (Xeon Phi coprocessor 3120 series)? Specifically, I make an offload call ("#pragma offload target(mic)") for a function that…
wasilis
  • 115
  • 1
  • 8
3
votes
1 answer

Latest OpenCL Driver for Xeon Phi

I am struggling to get the latest OpenCL driver for Intel Xeon Phi. I have a Knights Corner (KNC) and I can only find the deprecated OpenCL Runtime 14.2 (from 2014?). Where can I find the non-deprecated release? Website:…
user3819881
  • 377
  • 3
  • 13
3
votes
1 answer

_mm512_storenr_pd and _mm512_storenrngo_pd

What is the difference between _mm512_storenrngo_pd and _mm512_storenr_pd? _mm512_storenr_pd(void * mt, __m512d v): Stores packed double-precision (64-bit) floating-point elements from v to memory address mt with a no-read hint to the…
boraas
  • 929
  • 1
  • 10
  • 24
3
votes
2 answers

Convert 16 bit mask (__mmask16) to __m128i control byte mask on KNL (Xeon Phi 7210)

I wish to perform a conversion between __mmask16 and __m128i. However, as posted at https://stackoverflow.com/a/32247779/6889542 /* convert 16 bit mask to __m128i control byte mask…
veritas
  • 196
  • 13
3
votes
1 answer

Loop sequence in OpenMP Collapse performance advise

I found Intel's performance suggestion for the collapse clause in OpenMP on Xeon Phi: #pragma omp parallel for collapse(2) for (i = 0; i < imax; i++) { for (j = 0; j < jmax; j++) a[j + jmax*i] = 1.; } Modified example for better…
Francium
  • 81
  • 7