This question is about memory consistency. There is an example below that might help if it's unclear.

A problem I am looking at asks for code that can produce an outcome when executed on Power/ARM that it could not produce on the Sparc under RMO. Is this possible, please? Many thanks.

[The hint I am given is that Power/ARM don't do atomic stores. This is in the sense that a store can appear in different L1 caches at different times, rather than in the sense that it's possible to get a view of a partly executed write. I think the hint might not be right, because RMO does not preserve load order, and reordered loads can in turn produce any outcome that a non-atomic write could?]

To clarify the question, suppose I asked about TSO rather than RMO. The answer could have four threads: i) x=1, ii) y=2, iii) r1=[x]; r2=[y]; iv) r3=[y]; r4=[x]. Variables x and y are initialised to 0. The outcome r1, r2, r3, r4 = 1, 0, 2, 0 would not be possible under TSO, which only delays writes (and does not reorder anything). Either the assignment to x or the assignment to y happened first, and the outcome is inconsistent with both of those possibilities: thread iii) saw x written before y, while thread iv) saw y written before x. The outcome can however occur on ARM because different CPUs can see different writes at different times.
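A minimal sketch of the four-thread example, assuming a deliberately simplified toy model rather than the real TSO, RMO, Power or ARM definitions (fences, dependencies and store buffers are not modelled). The names `outcomes`, `store_atomic` and `loads_in_order` are made up for illustration. It enumerates every schedule of the loads and of the steps in which a store becomes visible to a reader, and checks whether r1, r2, r3, r4 = 1, 0, 2, 0 is reachable.

```python
from itertools import permutations

# Threads iii) and iv) from the example: each loads the two
# variables in the opposite order. Threads i) and ii) are the writers.
READERS = {"T3": ["x", "y"], "T4": ["y", "x"]}
WRITES = {"x": 1, "y": 2}
TARGET = (1, 0, 2, 0)  # r1, r2, r3, r4


def respects_program_order(schedule):
    """True if each reader's first load is scheduled before its second."""
    return all(
        schedule.index(("load", t, 0)) < schedule.index(("load", t, 1))
        for t in READERS
    )


def outcomes(store_atomic, loads_in_order=True):
    """Return every reachable (r1, r2, r3, r4) in the toy model.

    store_atomic=True : a store becomes visible to both readers in one step.
    store_atomic=False: a store becomes visible to each reader in its own step.
    loads_in_order    : whether a reader's loads must follow program order.
    """
    loads = [("load", t, i) for t in READERS for i in (0, 1)]
    if store_atomic:
        props = [("prop", var, tuple(READERS)) for var in WRITES]
    else:
        props = [("prop", var, (t,)) for var in WRITES for t in READERS]

    seen = set()
    for schedule in permutations(loads + props):
        if loads_in_order and not respects_program_order(schedule):
            continue
        # Each reader has its own view of memory, initially all zeros.
        view = {t: {"x": 0, "y": 0} for t in READERS}
        regs = {}
        for kind, a, b in schedule:
            if kind == "prop":  # a = variable, b = readers that now see it
                for t in b:
                    view[t][a] = WRITES[a]
            else:               # a = reader, b = index of the load
                regs[(a, b)] = view[a][READERS[a][b]]
        seen.add((regs[("T3", 0)], regs[("T3", 1)],
                  regs[("T4", 0)], regs[("T4", 1)]))
    return seen


if __name__ == "__main__":
    # Atomic stores, loads in order: the outcome is unreachable.
    print(TARGET in outcomes(store_atomic=True))    # False
    # Non-atomic stores, loads in order: the outcome is reachable.
    print(TARGET in outcomes(store_atomic=False))   # True
    # Atomic stores, loads free to reorder: also reachable.
    print(TARGET in outcomes(store_atomic=True, loads_in_order=False))  # True
```

In this toy model the last case reaches the same outcome through load reordering alone, which is the doubt raised about the hint. It says nothing about variants of the test that constrain load order, since those are outside what the sketch models.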

Joe Huha
  • what category of differences are you looking for? what language are you interested in because "code" ultimately is machine code and none of it crosses architectures. I assume you are not looking for mips/mhz or mips/watt or things like that, something more functional? – old_timer May 29 '15 at 12:46
  • I am looking for differences in how the different architectures handle memory consistency. One way of answering the question would be in pseudo-code since it should only need reads and writes, and all processors can do those (even if the instructions are different). So I would like a code fragment that can produce a possible outcome on Power/ARM that it could not under RMO. This will obviously involve more than one thread. See example in next comment. – Joe Huha May 29 '15 at 12:52
  • I assume you want more than what is on Wikipedia's chart? – old_timer May 29 '15 at 12:57
  • since you didn't specify, and correct me if I am wrong, TSO = total store order, RMO = relaxed-memory order, PSO = partial store order. From the same Wikipedia page. – old_timer May 29 '15 at 12:59
  • For example, had I asked about TSO rather than RMO, the answer would have four threads: i) x=1, ii)y=2, iii) r1=[x]; r2=[y]; iv) r3=[y]; r4=[x]. With x and y initialised to 0. The outcome r1, r2, r3, r4 = 1, 0, 2, 0 would not be possible because TSO only delays writes (and does not reorder anything), so either the assignment to x or y happened first and the result is inconsistent with either of those possibilities. It can however happen on the ARM because different CPUs can see different writes at different times. – Joe Huha May 29 '15 at 13:00
  • @dwelch I'd hardly be aware that RMO does not preserve load order if I didn't know what it stood for. Is it really sensible to assume that someone who asks about atomic writes and load reordering cannot use Google? – Joe Huha May 29 '15 at 13:05
  • there are a significant number of folks that use SO that apparently cannot use google, just as many cannot even search within SO before asking a question, not saying you can or can't, I assume you can but it is a very reasonable assumption at SO – old_timer May 29 '15 at 13:25
  • Please edit your question directly to further clarify what you are after. Something that is too broad or something that is strictly opinion based will eventually get closed/discarded. – old_timer May 29 '15 at 13:26
  • @dwelch sadly, in addition to people who ask questions without searching for the answer first, there are also many people who will insist on answering questions without reading them. It's unfortunate. – Joe Huha May 29 '15 at 13:33
  • I can't really give an answer myself, but I can suggest [this paper](http://www0.cs.ucl.ac.uk/staff/j.alglave/papers/toplas14.pdf), which is a really interesting bit of work and goes into a fair bit of detail comparing memory models. – Notlikethat May 29 '15 at 18:31
  • What makes you think there's a difference between Sparc RMO and ARM to begin with? Anyway, in practice, see section 5.7.2 [here](https://books.google.co.il/books?id=NBR-LDyeuu4C&pg=PA55&lpg=PA55&dq=sparc+rmo+store+ordering+primer&source=bl&ots=Mua8RzT14o&sig=DxjWdY7aBgj2fwbwJduqlA0OM8s&hl=iw&sa=X&ei=wsppVb7nCYGOU_bBgdgN&ved=0CCcQ6AEwAQ#v=onepage&q=sparc%20RMO&f=false), it indicates Sparc implementations to date are still using TSO even when asked for RMO, although I'm not too sure about this. – Leeor May 30 '15 at 14:56
  • The question implies I am rather skeptical as to whether there is a difference at all. But a practice problem I am looking at suggests there somehow is. It's supposedly related to write atomicity. I don't see how that could happen, but I was hoping someone would tell me either way. In light of this, the fact that RMO is now obsolete on the Sparc does not really matter. – Joe Huha May 30 '15 at 19:47
  • Interesting, but if they produce different results that would defeat the purpose of a consistency model in the first place. I mean, they all should give you the same result since they are all *consistency* models. Yes, they're implemented differently (some so strict they're inhibiting the CPU; others so relaxed they end up mis-speculating a lot, etc.). Nevertheless, the end result should be the same: consistent memory. – waleed Jan 21 '16 at 01:58

0 Answers