Questions tagged [cpu-cache]

A CPU cache is a hardware structure used by the CPU to reduce the average memory access time.

Caching is beneficial whenever data elements are re-used.


Caching is a general policy aimed at eliminating, for repeated re-accesses, the latency already paid once to reach an otherwise "expensive" (read: slow) resource (storage).


Caching does not speed up memory access itself.

The most a professional programmer can achieve is to exercise due care so that latency can be masked in a concurrent mode of code execution: issue instructions well before the forthcoming memory data is actually consumed, so that the cache management can evict an LRU line and pre-fetch the requested data from slow DRAM in the meantime.


How does it work?

Main memory is usually built with DRAM technology, which allows for big, dense and cheap storage structures. But DRAM access is much slower than the cycle time of a modern CPU (the so-called memory wall). A CPU cache is a smaller memory, usually built with SRAM technology (expensive, but fast), that reduces the number of accesses to main memory by storing the main memory contents that are likely to be referenced in the near future. Caches exploit a property of programs: the principle of locality. Adjacent memory addresses are likely to be referenced close together in time (spatial locality), and an address that is referenced once is likely to be referenced again soon (temporal locality).

Each CPU cache line is tagged with an address, stored in extra SRAM cells. These tag cells indicate which memory address the cached data belongs to; since the cache can never mirror the entire system memory, this address must be stored alongside the data. The index into the array selects a set. The index and the tag can each use either physical or virtual (MMU) addresses, leading to the three cache types PIPT, VIVT and VIPT.

Modern CPUs contain multiple levels of cache. In SMP systems, a cache level may be private to a single CPU, shared by a cluster of CPUs, or shared by the whole system. Because caching can result in multiple copies of the same data being present in an SMP system, cache coherence protocols are used to keep the copies consistent. VIVT and VIPT caches can also interact with the MMU (and its own cache, commonly called a TLB).

Questions regarding CPU cache inconsistencies, profiling or under-utilization are on-topic.

For more information see Wikipedia's CPU-cache article.


1011 questions
0
votes
0 answers

PreLoad Engine (PLE) ARM A9 MPCore

I'm trying to pre-load SDRAM memory into the L2 cache. I have initialised the MMU and made 1 translation table. I also enabled the cache and I see the software is using the cache as well... To load some SDRAM into my L2 cache I tried to work with the…
0
votes
1 answer

SMP boot of ARM Cortex A9 sequence with MMU/cache enabled

I am trying to do SMP boot in U-Boot on a dual-core ARM Cortex A9 system with MMU/cache enabled. I need the sequence of initializations. In what order should the following things happen? MMU page table setup Set SMP bit…
prasanna
  • 51
  • 5
0
votes
0 answers

What causes the retired instructions to increase?

I have a 496*O(N^3) loop. I am performing a blocking optimization technique where I'm operating on 2 images at a time instead of 1. In raw terms, I am unrolling the outer loop. (The non-unrolled version of the code is as shown below: ) BTW, I'm using…
0
votes
2 answers

Difference between use of while() and sleep() to put program into sleep mode

I have created a shared object, access it from two different programs, and measure the time. The DATA array is the shared object between the two processes. Case 1: Use of while inside program1 program1 : access shared DATA array ;// to load into memory…
bholanath
  • 1,699
  • 1
  • 22
  • 40
0
votes
1 answer

Unexpected output in C with access to ARRAY in memory with RDTSC

Here is my program in C. #include #include #include #include static int DATA[1024]={1,2,3,4,.....1024}; inline void foo_0(void) { int j; puts("Hello, I'm inside foo_0"); int k=0; …
bholanath
  • 1,699
  • 1
  • 22
  • 40
0
votes
0 answers

How to flush out the Shared function data from CPU cache

I am creating a shared data for two processes and then after reading data from CPU cache, I want to flush out the shared function data from CPU cache. I am able to find the starting address of that particular shared data in cache memory but unable…
Amit_T
  • 149
  • 11
0
votes
1 answer

Calculating actual/effective CPI for 3 level cache

(a) You are given a memory system that has two levels of cache (L1 and L2). Following are the specifications: Hit time of L1 cache: 2 clock cycles Hit rate of L1 cache: 92% Miss penalty to L2 cache (hit time of L2): 8 clock cycles Hit rate of L2…
User14229754
  • 85
  • 2
  • 12
0
votes
1 answer

ARM bare-metal with MMU: write to non-cachable,non-bufferable mapped area fail

I have an ARM Cortex A9 CPU with 2 cores, but I use just 1 core while the other sits in a busy loop. I set up the MMU table using sections (1MB per entry) like this: 0x00000000-0x14ffffff => 0x00000000-0x14ffffff (non-cachable,…
sing lam
  • 131
  • 1
  • 10
0
votes
0 answers

Finding cache CPI time

I need a formula, or at least to be pointed in the right direction; it involves cache and CPI time. I have a base machine with a 2.4 GHz clock rate and L1 and L2 caches. L1 is 256K direct mapped write through . 90% read without a hit rate without…
0
votes
2 answers

Will the workload (usage) of a CPU core be 100% if there is a persistent cache miss?

That is, if the processor core spends most of its time waiting for data from RAM or the L3 cache due to cache misses, but the system is real-time (real-time thread priority), and the thread is pinned (affinity) to the core and works without switching…
Alex
  • 12,578
  • 15
  • 99
  • 195
0
votes
2 answers

Memory performance/cache puzzle

I have a memory performance puzzle. I'm trying to benchmark how long it takes to fetch a byte from main memory, and how various BIOS settings and memory hardware parameters influence it. I wrote the following code for Windows that, in a loop,…
Andrew
  • 867
  • 7
  • 20
0
votes
0 answers

Difference between eviction due to clflush and eviction due to access to same set by other process

As per my understanding, when we use clflush(&Array1[i]), we manually evict the cache line where Array1[i] resides, and it is guaranteed that the element Array1[i] will not be present in the cache; next time after clflush, when we try…
bholanath
  • 1,699
  • 1
  • 22
  • 40
0
votes
1 answer

What exactly are the memory read/write operations of the processor

I'm sure my title is not perfect, so let me clarify. Per this article: http://msdn.microsoft.com/en-us/magazine/jj863136.aspx , void Print() { int d = _data; // Read 1 if (_initialized) // Read 2 Console.WriteLine(d); else …
Stav Alfi
  • 13,139
  • 23
  • 99
  • 171
0
votes
1 answer

How can I check my CPU cache in Windows 8?

I have a problem: I cannot find any panel or command in Windows 8 which can show me my CPU cache. There is some software that can get the system configuration, but it does not show full info: it reports everything except the CPU cache.
Amin AmiriDarban
  • 2,031
  • 4
  • 24
  • 32
0
votes
0 answers

Hibernate / Spring transaction issue with Infinispan L2 cache

I am trying to use Infinispan as Hibernate L2 cache for an application which use technologies like Tomcat 6, Hibernate 4 and Spring 3.5. The application running in Tomcat and our current transaction manager is …