This includes fixing acquire semantics on mcs_lock fast path.
This represents an additional fence on the fast path for
acquire semantics post-acquisition.
More specifically, note that in memory models where atomic
operations do not have serializing effects that atomic
read-modify-write operations are modeled as store operations.
These operations serialize atomic-RMW operations with respect
to each other, loads and stores. In addition to this, the
load_depends implementations have been removed.
ck_pr_fence_{load_load,store_store,load_store,store_load} operations
have been added. In addition to this, it is no longer the responsibility
of architecture ports to determine when to emit a specific fence. Instead,
the underlying port will always emit the necessary instructions to
enforce strict ordering. The higher-level include/ck_pr implementation will
enforce whether or not a fence is necessary to be emitted according to
the memory model specified by ck_md (CK_MD_{TSO,RMO,PSO}).
In other words, only ck_pr_fence_strict_* is implemented by the MD-ck_pr
port.
This function allows for explicit execution of all
deferred callbacks in an epoch_record. The primary
motivation is currently for performance profiling
but there are other use-cases where best-effort
semantics could be applied.
Several counter-examples were found which break in the
presence of store-to-load re-ordering. Strict fence
semantics are necessary.
Thanks to Paul McKenney for helpful discussions.
These variants of ck_ring_enqueue_* return the snapshot of queue
length with respect to the linearization point. This can be used to
extract ring size without incurring additional cacheline invalidation
overhead from the writer.
John Wittrock has contributed a phase-fair reader-writer
lock implementation. These locks allow phase fairness
guarantees between readers and writers. This work includes
additional changes and clean-up.
Follow-up work is expected.
Thanks to John Wittrock for patches and Professor Gabriel
Parmer (http://www.seas.gwu.edu/~gparmer/) for advising.
Upon popular request, added a variant of the ticket spinlock
with trylock support. This is pending additional verification
on other architectures besides x86*. It is still unclear whether
this implementation will be the default as it is has slower
fast path.
Add trylock support to the ck_spinlock validation tests.
It currently only tests ck_spinlock_ticket_t trylock
functionality if available.