“Strong” and “weak” hardware memory models
In Welcome to the Jungle, I predicted that “weak” hardware memory models will disappear. This is true, and it’s happening before our eyes:
- x86 has always been considered a “strong” hardware memory model that supports sequentially consistent atomics efficiently.
- The other major architecture, ARM, recently announced that they are now adding strong memory ordering in ARMv8 with the new sequentially consistent ldra and strl instructions, as I predicted they would. (Actually, Hans Boehm and I influenced ARM in this direction, so it was an ever-so-slightly disingenuous prediction…)
However, at least two people have been confused by what I meant by “weak” hardware memory models, so let me clarify what “weak” means – it means something different for hardware memory models and software memory models, so perhaps those aren’t the clearest terms to use.
By “weak (hardware) memory model” CPUs I mean specifically ones that do not natively support efficient sequentially consistent (SC) atomics, because on the software side programming languages have converged on “sequential consistency for data-race-free programs” (SC-DRF, roughly aka DRF0 or RCsc) as the default (C11, C++11) or only (Java 5+) supported software memory model for software. POWER and ARMv7 notoriously do not support SC atomics efficiently.
Hardware that supports only hardware memory models weaker than SC-DRF, meaning that they do not support SC-DRF efficiently, are permanently disadvantaged and will either become stronger or atrophy. As I mentioned specifically in the article, the two main current hardware architectures with what I called “weak” memory models were current ARM (ARMv7) and POWER:
- ARM recently announced ARMv8 which, as I predicted, is upgrading to SC acquire/release by adding new SC acquire/release instructions ldra and strl that are mandatory in both 32-bit and 64-bit mode. In fact, this is something of an industry first — ARMv8 is the first major CPU architecture to support SC acquire/release instructions directly like this. (Note: That’s for CPUs, but the roadmap for ARM GPUs is similar. ARM GPUs currently have a stronger memory model, namely fully SC; ARM has announced their GPU future roadmap has the GPUs fully coherent with the CPUs, and will likely add “SC load acquire” and “SC store release” to GPUs as well.)
- It remains to be seen whether POWER will adapt similarly, or die out.
Note that I’ve seen some people call x86 “weak”, but x86 has always been the poster child for a strong (hardware) memory model in all of our software memory model discussions for Java, C, and C++ during the 2000s. Therefore perhaps “weak” and “strong” are not useful terms if they mean different things to some people, and I’ve updated the WttJ text to make this clearer.
I will be discussing this in detail in my atomic<> Weapons talk at C&B next week, which I hope to make freely available online in the near future (as I do most of my talks). I’ll post a link on this blog when I can make it available online.
Filed under: C# / .NET, C++, Concurrency, Hardware, Java, Talks & Events