At some point in my code, I need to push my writes all the way to the DIMM or DDR device. My requirement is to ensure the write reaches the row, bank, and column of the DDR device on the DIMM. I then need to read back what I've written to main memory, and I do not want a cache to supply the value. Instead, after writing, I want to fetch the value from main memory (the DIMMs).

So far I've been using Intel's x86 instruction `wbinvd` (write back and invalidate cache). However, this flushes the entire cache hierarchy, and the write-back requests then go to main memory. Even so, the data may sit for a non-trivial amount of time in the write buffer of the memory controller (Intel calls it the integrated memory controller, or IMC). How long the memory controller holds the data depends on the algorithm it uses to schedule writes.

Is there a way to force all existing or pending writes in the memory controller's write buffer out to the DRAM devices?

What I am looking for is something more direct and lower-level than `wbinvd`. If you could point me to the right documents or specs that describe this, I would be grateful.

Generally, the IMC has several registers that can be read or written. I looked through the chipset specs for those registers but could not find anything useful.

Thanks for taking the time to read this.

hit.at.ro
  • Heads-up: You may want to check with your employer to ensure that asking questions about this hardware doesn't violate your NDA. –  May 30 '14 at 20:53
  • What about `clflush` for your writes? – Leeor May 30 '14 at 21:02
  • Rohit, you can also mark the memory range as Uncacheable (UC) in the MTRRs. Are you sure the write buffer in the memory controller is what is limiting you? Is it a normal DDR/DIMM device controlled by it? What is the exact model of your processor/IMC? I think there is little ability to change the IMC's algorithms while still writing the data to memory. – osgx May 30 '14 at 21:26
  • @clflush : cacheline flush will not cut it in this case – hit.at.ro May 30 '14 at 21:30
  • Rohit, there are also some counters in the IMC, accessible via a BAR "(in PCI configuration space) at Bus 0; Device 0; Function 0; Offset 048H.": https://software.intel.com/en-us/articles/monitoring-integrated-memory-controller-requests-in-the-2nd-3rd-and-4th-generation-intel - you can use them to measure the latency between the CPU's write and the write to the DDR banks. – osgx May 30 '14 at 21:33
  • @osgx, it's a 4th gen Intel processor that supports two IMCs. And thanks, those are some ideas to play around with for now. Waiting for the latency after reading it seems better but not very reliable :( – hit.at.ro May 30 '14 at 21:47

1 Answer

This is why Intel added the CLWB instruction to the instruction set. At the time of writing, there is no off-the-shelf hardware that implements the instruction (AFAIK). For similar reasons, ARMv8.2-A added the "DC CVAP" (Clean data cache by virtual address to Point of Persistence) instruction.
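As a rough illustration of the per-line flush pattern that CLWB is meant to serve, here is a minimal C sketch. It uses `_mm_clflush` (which evicts the line) because that is available on every x86-64 part; on hardware with CLWB you would presumably swap in `_mm_clwb` and compile with `-mclwb`. The names `flush_range` and `write_then_reload` and the 64-byte `CACHE_LINE` are my own assumptions, not anything from the question. Note that this only pushes the data out of the CPU caches toward the memory controller; it does not prove the write has left the IMC's buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <emmintrin.h>   /* _mm_clflush, _mm_mfence (SSE2, baseline on x86-64) */
#endif

#define CACHE_LINE 64u   /* assumed line size; real code should query CPUID */

/* Flush every cache line overlapping [addr, addr + len), then fence.
 * Returns the number of cache lines visited. */
size_t flush_range(const void *addr, size_t len)
{
    assert(addr != NULL);
    if (len == 0)
        return 0;

    uintptr_t p   = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines  = 0;

    for (; p < end; p += CACHE_LINE, ++lines) {
#if defined(__x86_64__)
        _mm_clflush((const void *)p);   /* write back and evict this line */
#endif
    }

#if defined(__x86_64__)
    _mm_mfence();                        /* order the flushes before later loads */
#else
    __atomic_thread_fence(__ATOMIC_SEQ_CST);  /* placeholder on non-x86 builds */
#endif
    return lines;
}

/* Store a value, flush its line, and load it again. After the flush the
 * load has to miss in the CPU caches, but it may still be satisfied by
 * the memory controller rather than by the DRAM array itself. */
uint64_t write_then_reload(volatile uint64_t *slot, uint64_t value)
{
    *slot = value;
    flush_range((const void *)slot, sizeof *slot);
    return *slot;
}
```

Calling `write_then_reload(&x, v)` returns `v` either way; the difference is only in where the reload is serviced, which is exactly the part software cannot observe directly.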

Still, there is no architectural way to flush transactions sitting in the IMC. The NVDIMM specs address this problem and provide NVDIMM-specific mechanisms to ensure that writes have reached the device. See the description of the Flush Hint Address Structure in the NVDIMM Firmware Interface Table (NFIT) in the recent ACPI specification.
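To make the Flush Hint mechanism a little more concrete, here is a hedged C sketch. The struct reflects my reading of the Flush Hint Address Structure layout, so check the field offsets against the ACPI specification before relying on it. `wpq_flush` is a hypothetical helper that assumes the caller has already mapped one of the flush hint physical addresses (normally kernel or firmware code does this; ordinary user space cannot). The value written and the fences on both sides are assumptions on my part, and this applies to NVDIMMs only, not to regular DDR DIMMs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__)
#include <xmmintrin.h>   /* _mm_sfence */
#define STORE_FENCE() _mm_sfence()
#else
#define STORE_FENCE() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* NFIT Flush Hint Address Structure, as I understand the ACPI layout.
 * Field names are mine; verify against the specification. */
struct nfit_flush_hint {
    uint16_t type;            /* structure type (6 for flush hint) */
    uint16_t length;          /* total length in bytes */
    uint32_t device_handle;   /* NFIT device handle of the NVDIMM */
    uint16_t hint_count;      /* number of entries in hint_address[] */
    uint8_t  reserved[6];
    uint64_t hint_address[];  /* physical addresses of the flush hints */
};

/* Write to an already-mapped flush hint address. On a real platform this
 * write asks the memory controller to drain pending writes to the NVDIMM.
 * mapped_hint is a hypothetical, caller-provided mapping. */
void wpq_flush(volatile uint64_t *mapped_hint)
{
    assert(mapped_hint != NULL);
    STORE_FENCE();        /* earlier stores are globally visible first */
    *mapped_hint = 1;     /* assumed: the written value does not matter */
    STORE_FENCE();        /* the hint write is ordered before what follows */
}
```

A typical sequence under these assumptions would be: store the data, flush the affected cache lines, then call `wpq_flush` on one of the hint addresses for that NVDIMM.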

  • Thank you! Can you add some examples of NVDIMM-specific commands? And some references about the addition of the CLWB/PCOMMIT instructions, for example https://danluu.com/clwb-pcommit "Why Intel added the CLWB and PCOMMIT instructions"... – osgx Jun 10 '17 at 02:03