This operation moves ownership from one hash set object
to another and re-assigns callback functions to developer-specified
values. This allows for dynamic configuration of allocation
callbacks and is necessary for use-cases involving executable code
which may be unmapped underneath the hash set.
The developer is responsible for enforcing barriers and enforcing
the visibility of the new hash set.
When spinning on global counters, it cannot be assumed that is_locked
functions will guarantee atomic to load ordering, an explicit fence
is necessary. is_locked will only guarantee load ordering.
These come in the form of CK_ELIDE_ADAPTIVE_PROTOTYPE,
CK_ELIDE_LOCK_ADAPTIVE and CK_ELIDE_UNLOCK_ADAPTIVE.
Primarily pushing this for the few that are playing with
master.
This is inspired by Andi Kleen's work for adaptive behavior
in the Linux kernel's RTM locks implementation. There are
various differences in the state machine, however. Specifically,
the concept of a retry and a busy-wait has been unified due
to state machine simplification such that any exhausted busy-wait
cycle reverts to a forfeit (a busy-wait is a specialized retry).
Follow-up work will involve allowing for is_locked behavior
to yield what users expect, if called from with-in a transaction
through the wrapper. It is warned that this will come at a performance
penalty.
This is an example limitation of fence_X_Y variant. I am
considering extending this to include an acquire extension.
Use a memory fence to force total order in a manner that
will be clearer to other developers who read this.
This did not manifest as a problem on any target architectures
due to their handling of atomic operations (SPARC models it as
both a load and a store, while Power atomic_load ordering was
enforced through a full barrier).
It is possible this will be moved to a self-contained file.
For a majority of architectures, RTM is an unnecessary
implementation-specific optimization.
I accidentally removed ck_pr_fence implicit compiler
barrier semantics in re-structure of ck_pr_fence.
This does affect the correctness of any data structures
in ck_pr_fence or the correctness of consumers of ck_pr
operations where ck_pr serves as linearization points.
The reason it does not affect any CK data structures is
that explicit compiler barriers (whether they are store/load
operations or atomic ready-modify-write operations) always
serve as linearization points.
However, if consumers are doing tricky things like using
these barriers to serialize aliased locations for correctness,
then it is possible for compiler re-ordering to bite them in
the ass.
These add unnecessary complexity to the ck_pr_fence interface.
Instead, it can be safely assumed that developers will use
ck_pr_fence_X to enforce X -> X ordering.
This includes fixing acquire semantics on mcs_lock fast path.
This represents an additional fence on the fast path for
acquire semantics post-acquisition.