These add unnecessary complexity to the ck_pr_fence interface.
Instead, it can be safely assumed that developers will use
ck_pr_fence_X to enforce X -> X ordering.
These operations serialize atomic-RMW operations with respect
to each other, loads and stores. In addition to this, the
load_depends implementations have been removed.
ck_pr_fence_{load_load,store_store,load_store,store_load} operations
have been added. In addition to this, it is no longer the responsibility
of architecture ports to determine when to emit a specific fence. Instead,
the underlying port will always emit the necessary instructions to
enforce strict ordering. The higher-level include/ck_pr implementation will
enforce whether or not a fence is necessary to be emitted according to
the memory model specified by ck_md (CK_MD_{TSO,RMO,PSO}).
In other words, only ck_pr_fence_strict_* is implemented by the MD-ck_pr
port.
build:
- configure step will generate relevant CFLAGS.
- build profiles are for convenience (developers can use themu
for cross-compilation).
regressions:
- Renamed ck_barrier unit tests to work-around behavior
of Solaris linker.
- Adopted use of a PTHREAD_CFLAGS variable.
ck_cc:
- Added internal CK_CC_IMM macro for compilers that are
verbose against impossible inline constraints (or limited
optimizers).
ck_pr/x86*:
- Adopted CK_CC_IMM macro.
- Dropped redundant constraints.
This work was mostly completed by Theo Schlossnagle
<jesus@omniti.com>, much thanks to him. He has
also provided access to a machine with Sun Studio 12.
There is a bug first generation AMD Opteron processors'
with cpuid family 0Fh and models less than 40h when it
comes to read-modify write operations after load/store
sequence. Not worth supporting this processor.
If you are on this processor, you can find more information
at: http://bugzilla.kernel.org/show_bug.cgi?id=11305#c2