
I am referring to this question and its top-voted answer:

Why are elementwise additions much faster in separate loops than in a combined loop?

My question: is there an easy way to determine the number of address bits (call it N) that a specific CPU uses for address aliasing of loads and stores?

owagh
  • Funny. Googling "partial address aliasing" turns up that question as the second result. I guess there isn't much literature about the topic. – Mysticial May 24 '12 at 05:13
  • Note that the address aliasing is particular to a cache, not a CPU. Most modern CPUs have at least two levels of cache. Furthermore, your question assumes that this is a constant. That's very much an implementation detail. – MSalters May 24 '12 at 09:10
  • Yes it is an implementation detail. I'm looking for a test program that I can run on my CPU and use it to figure out N for that specific CPU and one specific cache level. Or maybe just some piece of documentation someplace that states that. Whichever is easier. – owagh May 25 '12 at 19:12
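A test program of the kind asked for in the last comment could be sketched roughly as follows. This is only a sketch under assumptions, not a verified method: it assumes that partial address aliasing shows up as a slower store-then-load pair when the two addresses agree in their low N bits, and that the effect is large enough to rise above timing noise. The function names (`low_bits_match`, `ns_per_pair`, `probe_report`) and the 2 MiB alignment are hypothetical choices made for illustration. On many Intel cores the documented effect is "4K aliasing", which would correspond to N = 12, but other microarchitectures may behave differently or show nothing at all.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Returns 1 when a and b agree in their low n_bits bits, i.e. when hardware
   that compares only those bits could not tell the two addresses apart. */
int low_bits_match(uintptr_t a, uintptr_t b, unsigned n_bits)
{
    uintptr_t mask = (n_bits >= sizeof(uintptr_t) * 8)
                         ? ~(uintptr_t)0
                         : (((uintptr_t)1 << n_bits) - 1);
    return ((a ^ b) & mask) == 0;
}

/* Times `iters` store-then-load pairs whose addresses are `distance` bytes
   apart. Returns CPU nanoseconds per pair, or -1.0 if allocation fails. */
double ns_per_pair(size_t distance, long iters)
{
    const size_t align = (size_t)1 << 21;   /* assumed: 2 MiB base alignment */
    size_t span = distance + sizeof(int);
    char *raw;
    uintptr_t base;
    volatile int *st, *ld;
    clock_t t0, t1;
    long i;

    assert(distance >= sizeof(int) && distance % sizeof(int) == 0);

    raw = malloc(align + span);
    if (raw == NULL)
        return -1.0;
    memset(raw, 0, align + span);           /* touch the pages up front */

    /* Round up so the store address has its low 21 bits clear; the load
       address then differs from it only in the bits set in `distance`. */
    base = ((uintptr_t)raw + align - 1) & ~(uintptr_t)(align - 1);
    st = (volatile int *)base;
    ld = (volatile int *)(base + distance);

    t0 = clock();
    for (i = 0; i < iters; i++) {
        int v;
        *st = (int)i;   /* store ...                              */
        v = *ld;        /* ... followed by a load `distance` away */
        (void)v;
    }
    t1 = clock();

    free(raw);
    return (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / (double)iters;
}

/* Prints one timing per power-of-two distance 2^k, min_bit <= k <= max_bit. */
void probe_report(FILE *out, unsigned min_bit, unsigned max_bit, long iters)
{
    unsigned k;
    for (k = min_bit; k <= max_bit; k++)
        fprintf(out, "distance 2^%u: %.3f ns per store+load\n",
                k, ns_per_pair((size_t)1 << k, iters));
}
```

Calling `probe_report(stdout, 6, 20, 100000000L)` from `main` prints one line per distance. With a distance of exactly 2^k the two addresses match in all bits below k, so if the assumption holds, the smallest k at which the time per pair jumps would be an estimate of N. Timings are noisy, so run it several times, ideally pinned to one core, and treat the result as a hint rather than a measurement of any particular cache level.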

1 Answer


At the OS level: no. I'm not aware of any standard OS API (in Linux, Win32, or anywhere else) that gives user space any visibility into the CPU cache.

However, Intel provides some great tools for low-level performance analysis and optimization. For example,

paulsm4