
What is TCM on ARM processors? Is it a dedicated memory that resides next to the processor, or just a region of RAM that is configured as TCM?

If it's a dedicated memory, why can we configure its location and size?

bouqbouq

1 Answer

47

TCM, Tightly-Coupled Memory, is one (or multiple) small, dedicated memory region that, as the name implies, is very close to the CPU. The main benefit is that the CPU can access the TCM every cycle. Unlike ordinary memory, there is no cache involved, which makes all memory accesses predictable.

The main use of TCM is to store performance-critical data and code. Interrupt handlers, data for real-time tasks and OS control structures are common examples.
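As a rough illustration of how such code and data are typically steered into TCM with a GCC-style toolchain, here is a minimal sketch. The section names `.dtcm_data` and `.itcm_text`, the macros, and the function names are hypothetical; they only have an effect if the project's linker script maps output sections with those names to the TCM address ranges of the particular chip, and initialised data or code placed there usually also needs startup code that copies it in from flash.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical section names: the linker script must map them to TCM.
 * On non-GCC/non-ELF toolchains the macros expand to nothing. */
#if defined(__GNUC__) && defined(__ELF__)
#define TCM_DATA __attribute__((section(".dtcm_data")))
#define TCM_CODE __attribute__((section(".itcm_text")))
#else
#define TCM_DATA
#define TCM_CODE
#endif

#define BUF_LEN 64u

/* Data for a real-time task, kept out of ordinary (cached) RAM. */
TCM_DATA static volatile uint32_t sample_buffer[BUF_LEN];
TCM_DATA static volatile uint32_t sample_count;

/* Stand-in for an interrupt handler that should run from instruction TCM.
 * It stores one sample into the ring buffer. */
TCM_CODE void timer_isr(uint32_t sample)
{
    sample_buffer[sample_count % BUF_LEN] = sample;
    sample_count++;
}

/* Returns the most recently stored sample; at least one must exist. */
uint32_t latest_sample(void)
{
    assert(sample_count > 0u);
    return sample_buffer[(sample_count - 1u) % BUF_LEN];
}
```

Whether the objects really ended up in TCM is best checked in the linker map file, since a missing section mapping leaves them wherever the linker puts unmatched sections.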

if it's a dedicated memory, why can we configure its location and size

Making it configurable would just complicate the address decoding for all memory accesses while giving no real benefit over a fixed address range. So it was probably easier and faster to just tie the TCM to a fixed address.

Btw, if you are working on a system that has a TCM and you aren't using it yet, try placing your stack there. That usually gives you a few percent of performance gain for free, since all stack memory accesses are now single-cycle and no longer pollute the data cache.
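After moving the stack, it can be worth confirming that it really lives in TCM. The following is a small sketch of such a check, not something taken from any vendor library: it tests whether the address of a local variable falls inside a given region. `DTCM_BASE` and `DTCM_SIZE` are placeholder values assumed for illustration; take the real ones from your chip's memory map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values (assumption): replace with your chip's TCM range. */
#define DTCM_BASE ((uintptr_t)0x20000000u)
#define DTCM_SIZE ((size_t)0x00020000u)

/* Returns 1 if addr lies in [base, base + size), else 0. */
static int addr_in_region(uintptr_t addr, uintptr_t base, size_t size)
{
    /* Unsigned subtraction wraps around when addr < base, which yields
     * a huge value that fails the comparison, so one test suffices. */
    return (addr - base) < size;
}

/* Returns 1 if the current stack frame appears to lie in the region. */
static int stack_in_region(uintptr_t base, size_t size)
{
    volatile int probe = 0; /* lives on the current stack */
    assert(size > 0u);
    return addr_in_region((uintptr_t)&probe, base, size);
}
```

On the target, calling `stack_in_region(DTCM_BASE, DTCM_SIZE)` early in `main` gives a quick sanity check that the linker script and startup code set the stack pointer where you intended.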

Nils Pipenbrinck
  • What about its size? How can we configure its size, since it's hardware? – bouqbouq Jun 12 '15 at 10:00
  • That depends on the exact hardware. There are ARM architectures that let you split the TCM so you can use some of it as TCM and the rest as data cache. From a chip designer's view, when you design a microcontroller you can of course decide how large your TCM will be. – Nils Pipenbrinck Jun 12 '15 at 10:15
  • What I understand is: when the stack is in RAM, the processor fetches the stack data from RAM into the L1 cache and works with it there. But when the stack is in TCM, the data always stays in TCM (there is no fetch from TCM into the L1 cache), so the running program will always have fewer cache misses. Or am I mistaken? – bouqbouq Jul 30 '15 at 08:48
  • @MakhloufGharbi If your ARM system has a cache, what you wrote is true. TCM is always uncached, so you take pressure off the cache if you use it for your stack. That will improve overall performance. – Nils Pipenbrinck Jul 30 '15 at 08:51
  • I'm using a Cortex-R5F processor, which has a cache. I put the stack in TCM and I'm running some benchmarks. What struck me is that with the stack in RAM I got 92 data cache misses, and with the stack in TCM I got 91 data cache misses. I'm using event 0x03 to count the number of cache misses: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0460d/CHDGGECB.html . What I don't understand is that I get better performance with TCM even though there is only 1 cache miss of difference, and at the same time L1 cache is faster than TCM. Is it possible that what I'm counting is not the number of cache misses? – bouqbouq Jul 30 '15 at 09:01
  • @MakhloufGharbi Your CPU may also have a dedicated data path to the TCM, so the CPU can access L1 and TCM in the same cycle. That could increase the speed somewhat. Btw, if you only have 91 cache misses, your benchmarks likely run completely in the L1 cache. Try bigger data sets. – Nils Pipenbrinck Jul 30 '15 at 09:36
  • On STM32 this memory is named core-coupled memory (CCM), and it is inaccessible to DMA. I think this is an important note for developers. – kyb Oct 11 '17 at 08:21
  • I am working on an STM32F777II and was wondering how I can force the linker to place specific variables and structures in TCM/CCM only, and not in conventional RAM? – Akay Oct 08 '18 at 07:35
  • So, what is the difference from SRAM/OCRAM? – Angelo Dureghello Jan 20 '20 at 21:21
  • I found the comment about stack to be particularly useful. I had been wondering about whether I could use TCM for stack. I plan to try this on an old ARM926EJ-S with TCM and see what happens! – tevaughan Apr 13 '21 at 13:08