Questions tagged [cpu-cache]

A CPU cache is a hardware structure used by the CPU to reduce the average memory access time.

Caching is beneficial whenever data elements are re-used.


Caching is a general policy aimed at eliminating the latency of repeatedly re-accessing an already visited but otherwise "expensive" (read: slow) resource (storage): the cost of the first access is paid once, and subsequent accesses are served from the faster copy.


Caching does not speed up the memory itself.

The most a professional programmer can achieve is to exercise due care and allow some latency masking in a concurrent mode of code execution: carefully issue prefetch instructions well before the forthcoming data is actually consumed, so that the cache management can evict a least-recently-used (LRU) line and pre-fetch the requested data from slow DRAM in the meantime.
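A minimal sketch of such latency masking, assuming a GCC/Clang compiler (for the __builtin_prefetch intrinsic); the prefetch distance is purely a tuning assumption and must be measured per machine:

    #include <stddef.h>

    /* While element i is being summed, ask the cache hierarchy to start
     * fetching an element several iterations ahead, so the DRAM latency
     * overlaps with useful work instead of stalling the core. */
    #define PREFETCH_DISTANCE 16   /* assumed tuning value */

    long sum_with_prefetch(const long *data, size_t n)
    {
        long sum = 0;
        for (size_t i = 0; i < n; i++) {
            if (i + PREFETCH_DISTANCE < n)
                __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
            sum += data[i];   /* prefetched PREFETCH_DISTANCE iterations ago */
        }
        return sum;
    }

(Hardware prefetchers already handle a sequential walk like this one well; explicit prefetching pays off mainly for irregular access patterns.)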


How does it work?

Main memory is usually built with DRAM technology, which allows for big, dense, and cheap storage structures. But DRAM access is much slower than the cycle time of a modern CPU (the so-called memory wall). A CPU cache is a smaller memory, usually built with SRAM technology (expensive, but fast), that reduces the number of accesses to main memory by storing the main-memory contents that are likely to be referenced in the near future. Caches exploit a property of programs, the principle of locality: adjacent memory addresses are likely to be referenced close together in time (spatial locality), and an address that has been referenced once is likely to be referenced again soon (temporal locality).
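A small illustration of spatial locality in plain C (the array size is an arbitrary assumption): the two functions below do the same arithmetic and differ only in traversal order, yet the row-major version typically runs several times faster because each cache line fetched from DRAM is fully used:

    #define N 1024

    static double a[N][N];

    /* C stores a[i][j] row-major, so walking j in the inner loop touches
     * consecutive addresses: good spatial locality. */
    double sum_row_major(void)
    {
        double s = 0.0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i][j];
        return s;
    }

    /* Swapping the loops strides N * sizeof(double) bytes per access, so
     * each fetched cache line yields one element before being evicted. */
    double sum_col_major(void)
    {
        double s = 0.0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i][j];
        return s;
    }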

Each cache line is tagged with an address, held in extra SRAM cells. These tag cells record which memory address the cached data belongs to; since the cache can never mirror the entire system memory, the address must be stored alongside the data. Part of the address, the index, selects a set within the cache array. The index and the tag can each use either physical or virtual (MMU-translated) addresses, leading to the three types PIPT, VIVT, and VIPT.
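To make the tag/index/offset split concrete, here is a sketch that decomposes an address for a hypothetical 32 KiB, 8-way set-associative cache with 64-byte lines (the geometry is an assumption for illustration, not a property of any particular CPU):

    #include <stdint.h>
    #include <stdio.h>

    #define LINE_SIZE  64u                                   /* bytes per line */
    #define NUM_WAYS   8u
    #define CACHE_SIZE (32u * 1024u)
    #define NUM_SETS   (CACHE_SIZE / (LINE_SIZE * NUM_WAYS)) /* = 64 sets */

    int main(void)
    {
        uint64_t addr = 0x7ffd12345678u;                 /* example address  */

        uint64_t offset = addr % LINE_SIZE;              /* byte within line */
        uint64_t index  = (addr / LINE_SIZE) % NUM_SETS; /* selects the set  */
        uint64_t tag    = addr / (LINE_SIZE * NUM_SETS); /* kept in tag SRAM */

        printf("offset=%llu index=%llu tag=%#llx\n",
               (unsigned long long)offset,
               (unsigned long long)index,
               (unsigned long long)tag);
        return 0;
    }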

Modern CPUs contain multiple levels of cache. In SMP systems, a cache level may be private to a single core, shared by a cluster of cores, or shared by the whole system. Because caching can leave multiple copies of the same data present in an SMP system, cache-coherence protocols are used to keep the copies consistent. VIVT and VIPT caches can also interact with the MMU (and its own cache, commonly called a TLB).
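One programmer-visible consequence of coherence is "false sharing". The sketch below (assuming POSIX threads and 64-byte cache lines; the iteration count is arbitrary) has two threads updating adjacent counters that land in the same line, so every write forces the coherence protocol to invalidate the other core's copy, even though the threads never touch each other's data:

    /* compile with: cc -O2 -pthread false_sharing.c */
    #include <pthread.h>
    #include <stdio.h>

    /* Both counters fit in one 64-byte cache line, so the line
     * ping-pongs between the cores running the two threads. */
    static struct { long a; long b; } shared_line;

    static void *bump_a(void *arg)
    {
        (void)arg;
        for (long i = 0; i < 100000000L; i++)
            shared_line.a++;
        return NULL;
    }

    static void *bump_b(void *arg)
    {
        (void)arg;
        for (long i = 0; i < 100000000L; i++)
            shared_line.b++;
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, bump_a, NULL);
        pthread_create(&t2, NULL, bump_b, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("a=%ld b=%ld\n", shared_line.a, shared_line.b);
        return 0;
    }

Padding each counter out to its own 64-byte line (for example with C11 alignas(64)) removes the ping-ponging without changing the program's logic.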

Questions regarding CPU cache inconsistencies, profiling or under-utilization are on-topic.

For more information see Wikipedia's CPU-cache article.


1011 questions
8 votes • 1 answer

What cache coherence solution do modern x86 CPUs use?

I am somewhat confused about how cache coherence systems function in modern multi-core CPUs. I have seen that snooping-based protocols like MESIF/MOESI have been used in Intel and AMD processors; on the other hand…
8 votes • 1 answer

Programmatically get accurate CPU cache hierarchy information on Linux

I'm trying to get an accurate description of the data cache hierarchy of the current CPU on Linux: not just the size of individual L1/L2/L3 (and possibly L4) data caches, but also the way they are split or shared across cores. For instance, on my…
François Beaune • 4,270 • 7 • 41 • 65
8 votes • 2 answers

What is reference when it says L1 Cache Reference or Main Memory Reference

So I am trying to learn the performance metrics of various computer components, like the L1 cache, L2 cache, main memory, Ethernet, disk, etc., as below: Latency Comparison Numbers -------------------------- L1 cache **reference** 0.5…
8 votes • 1 answer

Intel's CLWB instruction invalidating cache lines

I am trying to find a configuration or memory access pattern for Intel's clwb instruction that would not invalidate the cache line. I am testing on an Intel Xeon Gold 5218 processor with NVDIMMs. The Linux version is 5.4.0-3-amd64. I tried using Device-DAX mode…
8 votes • 1 answer

Is it possible to read CPU cache hit/miss rate in Android?

Is it possible to read CPU cache hit/miss rate in Android?
Mohammad Moghimi • 4,636 • 14 • 50 • 76
8 votes • 1 answer

Committed Vs Retired instruction

It may be a stupid question, but I'm not able to find a clear explanation of these two phases of an instruction's life. My initial thinking was that they are synonymous, but I'm not sure anymore. I'm starting to think that for a load, commit and retire…
haster8558 • 423 • 6 • 15
8 votes • 2 answers

clflush to invalidate cache line via C function

I am trying to use clflush to manually evict a cache line in order to determine cache and line sizes. I didn't find any guide on how to use that instruction. All I see are some code samples that use higher-level functions for that purpose. There is a…
mahmood • 23,197 • 49 • 147 • 242
8 votes • 3 answers

Globally Invisible load instructions

Can some load instructions never be globally visible, due to store-to-load forwarding? To put it another way, if a load instruction gets its value from the store buffer, it never has to read from the cache. As it is generally stated that a load…
joz • 319 • 1 • 9
8 votes • 4 answers

Is stack memory contiguous physically in Linux?

As far as I can see, stack memory is contiguous in the virtual address space, but is stack memory also physically contiguous? And does this have something to do with the stack size limit? Edit: I used to believe that stack memory doesn't have to be…
cong • 1,105 • 1 • 12 • 29
8 votes • 2 answers

How is an LRU cache implemented in a CPU?

I'm studying up for an interview and want to refresh my memory on caching. If a CPU has a cache with an LRU replacement policy, how is that actually implemented on the chip? Would each cache line store a timestamp tick? Also, what happens in a dual…
fred basset • 9,774 • 28 • 88 • 138
8 votes • 12 answers

Is it possible to lock some data in CPU cache?

I have a problem… I'm writing data into an array in a while-loop, and I'm doing it really frequently. It seems that this writing is now a bottleneck in the code, so I presume it's caused by the writing to memory. This…
Alex • 81 • 1 • 2
8 votes • 3 answers

How to receive L1, L2 & L3 cache size using CPUID instruction in x86

I encountered a problem while preparing an x86 assembler project whose subject is to write a program that gets the L1 data, L1 code, L2, and L3 cache sizes. I tried to find something in the Intel documentation and on the Internet, but I failed. THE MAIN…
Tomek Janiuk • 93 • 1 • 3
8 votes • 4 answers

C++ How to force prefetch data to cache? (array loop)

I have a loop like this: start = __rdtsc(); unsigned long long count = 0; for(int i = 0; i < N; i++) for(int j = 0; j < M; j++) count += tab[i][j]; stop = __rdtsc(); time = (stop - start) * 1/3; I need to check how prefetching data influences…
lizaczek • 95 • 1 • 3 • 6
8 votes • 5 answers

How to produce the cpu cache effect in C and java?

In Ulrich Drepper's paper What every programmer should know about memory, in the 3rd part, CPU Caches, he shows a graph of the relationship between "working set" size and the CPU cycles consumed per operation (in this case, sequential reading).…
dawnstar • 507 • 5 • 10
7 votes • 2 answers

Allocate static memory in CPU cache in c/c++ : is it possible?

Is it possible to explicitly create static objects in the CPU cache, so as to make sure those objects always stay in the cache and no performance hit is ever taken from reaching all the way into RAM or, god forbid, HDD virtual memory? I am…
dtech • 47,916 • 17 • 112 • 190