I am referring to this question and its top-voted answer:
Why are elementwise additions much faster in separate loops than in a combined loop?
My question is: is there an easy way to determine the number of low-order address bits (call it N) that a specific CPU compares when checking loads and stores for address aliasing?
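To make clear what I mean by N, here is a minimal sketch of the aliasing condition as I understand it: two addresses look the same to the load/store hardware when their low N bits match. The function name `low_bits_alias` and the value N = 12 (the commonly cited "4K aliasing" case) are assumptions for illustration only. This is not a way of measuring N on a real CPU.

```python
def low_bits_alias(addr_a: int, addr_b: int, n_bits: int) -> bool:
    """Return True if the two addresses agree in their low n_bits bits."""
    mask = (1 << n_bits) - 1  # e.g. n_bits = 12 gives mask 0xFFF
    return (addr_a & mask) == (addr_b & mask)


# Assumed N = 12 for illustration; the real value is what I am asking about.
N = 12

# Addresses exactly 4 KiB apart share their low 12 bits, so they would alias.
print(low_bits_alias(0x10000, 0x11000, N))

# Shifting one address by 64 bytes changes the low 12 bits, so they would not.
print(low_bits_alias(0x10000, 0x11040, N))
```

What I am looking for is a way to find the real N for a given CPU, either from documentation or by measurement.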