Questions tagged [intel]

For issues related to Intel semiconductor chips and assemblies, Intel architectural features and ISA extensions, and the microarchitecture of Intel chips.

Intel Corporation is an American multinational semiconductor chip maker headquartered in Santa Clara, California, United States. Intel is the inventor of the x86 processor architecture and makes central processing units, motherboard chipsets, graphics processing units, network interface controllers, and many other devices related to communications and computing.

In addition to its hardware offerings, Intel also produces a variety of software, including compilers, libraries for mathematical computation (Intel MKL), threading (OpenMP, Intel Performance Primitives, Threading Building Blocks), and parallel communication (MPI, OFED/True Scale InfiniBand stack), as well as several other products included in the Intel Parallel Studio toolkit. Beyond these offerings, which are widely used in HPC, Intel also produces software for datacenter management and is one of the most prolific contributors to the Linux kernel.

This tag should be used for questions about Intel hardware and software.

The x86 and/or x86-64 tags are better choices for questions about assembly programming for the architecture, rather than things like performance tuning specifically for Intel's implementation of x86.



3529 questions
15
votes
3 answers

Variance in RDTSC overhead

I'm constructing a micro-benchmark to measure performance changes as I experiment with the use of SIMD instruction intrinsics in some primitive image processing operations. However, writing useful micro-benchmarks is difficult, so I'd like to first…
John Bartholomew
  • 6,428
  • 1
  • 30
  • 39
15
votes
1 answer

How many bits are there in a TLB ASID tag for Intel processors? And how to handle 'ASID overflow'?

According to some operating system textbooks, for faster context switches, people add ASID for each process in the TLB tag field, so we don't need to flush the entire TLB in a context switch. I have heard that some ARM processors and MIPS processors…
SaltedFishLZ
  • 157
  • 1
  • 9
15
votes
3 answers

Where is the Write-Combining Buffer located? x86

How is the Write-Combine buffer physically hooked up? I have seen block diagrams illustrating a number of variants: between L1 and memory controller; between CPU's store buffer and memory controller; between CPU's AGUs and/or store units. Is it…
Kay
  • 745
  • 5
  • 15
15
votes
4 answers

Is there a list of deprecated x86 instructions?

I'm taking an x86 assembly language programming class and know that certain instructions shouldn't be used anymore -- because they're slow on modern processors; for example, the loop instruction. I haven't been able to find any list of instructions…
LucidDefender
  • 219
  • 2
  • 4
15
votes
1 answer

How can the L1, L2, L3 CPU caches be turned off on modern x86/amd64 chips?

Every modern high-performance CPU of the x86/x86_64 architecture has some hierarchy of data caches: L1, L2, and sometimes L3 (and L4 in very rare cases), and data loaded from/to main RAM is cached in some of them. Sometimes the programmer may want…
osgx
  • 90,338
  • 53
  • 357
  • 513
15
votes
2 answers

Compiler optimization: g++ slower than intel

I recently acquired a computer with dual-boot to code in C++. On Windows I use the Intel C++ compiler, and on Linux I use g++. My programs consist mostly of computation (fixed point iteration algorithm with numerical integration, etc.). I thought I could get…
G. Ander
  • 180
  • 1
  • 9
15
votes
3 answers

What does "store-buffer forwarding" mean in the Intel developer's manual?

The Intel 64 and IA-32 Architectures Software Developer's Manual says the following about re-ordering of actions by a single processor (Section 8.2.2, "Memory Ordering in P6 and More Recent Processor Families"): Reads may be reordered with older…
jacobsa
  • 5,719
  • 1
  • 28
  • 60
14
votes
2 answers

Intel standard library (C++)

Does the Intel compiler have its own standard library, e.g., implementations of std::cout etc.? I want to adjust everything for Intel.
Shibli
  • 5,879
  • 13
  • 62
  • 126
14
votes
4 answers

How to read performance counters on i5, i7 CPUs

Modern CPUs have quite a lot of performance counters (http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-system-programming-manual-325384.html). How do I read them? I'm interested in cache…
user730816
14
votes
1 answer

How to detect E-cores and P-cores in a Linux Alder Lake system?

How can I check whether a particular CPU core belongs to the P-core or E-core group? Is there any way to list information about Performance/Efficiency cores in a running Linux x86_64 Alder Lake system, for example by printing any of the sysfs parameters?
14
votes
2 answers

How to explain poor performance on Xeon processors for a loop with both sequential copy and a scattered store?

I stumbled upon a peculiar performance issue when running the following C++ code on some Intel Xeon processors: // array_a contains permutation of [0, n - 1] // array_b and inverse are initialized arrays for (int i = 0; i < n; ++i) { array_b[i] =…
14
votes
1 answer

If I don't use fences, how long could it take a core to see another core's writes?

I have been trying to Google my question but I honestly don't know how to succinctly state the question. Suppose I have two threads in a multi-core Intel system. These threads are running on the same NUMA node. Suppose thread 1 writes to X once,…
Cube Fan
  • 143
  • 4
14
votes
1 answer

How to interpret perf iTLB-loads, iTLB-load-misses

I have a test case to observe perf iTLB-loads and iTLB-load-misses using perf stat -e dTLB-loads,dTLB-load-misses,iTLB-loads,iTLB-load-misses -p 22479, and I get the output: Performance counter stats for process id '22479': 1,262,817 dTLB-loads…
barfatchen
  • 1,630
  • 2
  • 24
  • 48
14
votes
2 answers

What is a Partial Flag Stall?

I was just going over this answer by Peter Cordes and he says, Partial-flag stalls happen when flags are read, if they happen at all. P4 never has partial-flag stalls, because they never need to be merged. It has false dependencies instead. Several…
Evan Carroll
  • 78,363
  • 46
  • 261
  • 468
14
votes
1 answer

Why does this code link on Intel Compiler 2015 but not Intel Compiler 2018?

My team recently upgraded from the 2015 Intel Compiler (Parallel Studio) to the 2018 version, and we're having a linker issue that has everyone tearing their hair out. I have the following class (moderately redacted for brevity) for handling…
stix
  • 1,140
  • 13
  • 36