write_latch allows writers to prevent incoming read acquisitions from
succeeding while permitting write_lock operations to complete. This is
useful if multiple writers are cooperating in linearizing a non-blocking
data structure and allows for lock state to be copied by the write lock
holder safely.
In collaboration with Paul Khuong and Jaidev Sridhar.
There is an off-by-one for slot ID sizeof(bytelock->readers) + 1.
This patch fixes the handling of this slot ID. Based off a patch
submitted by Albi Kavo <albi.kavo@gma....>.
Add a read-mostly mode, in which entries won't share cache lines with
associated datas (probes, probe_bound, etc).
This makes write operations slower, but make get faster.
This will serve to drive both pointer-based and type-specialized ring
implementations. Some full fences have also been removed as they are
unnecessary with the introduction of X_Y fence interface.
Remove the ring buffer from the struct ck_ring, it is now required to
explicitely pass the ring buffer for enqueue/dequeue. That can be useful for
systems with multiple address spaces.
This operation is short-hand notation for rebuilding
a hash table. This rebuild can occur in the presence
of concurrent readers and will require twice the amount
of memory of the existing hash table until completion.
This borrows from a technique described by Purcell and Harris
in "Non-blocking hashtables with open addressing" technical report.
Essentially, every slot will have an associated local probe maxim,
including tombstones. Highly aggressive workloads may still require
occassional garbage collection.
This function allows for faster insertions into tombstone-heavy
probe sequences by short-circuiting on tombstones rather than
continuing to probe. The user must already guarantee that the
entry being inserted is unique. If a non-unique key is inserted
with this operation, undefined behavior will result.
Could not find suitable use-case and generally doesn't
appear interesting to academics in the existing
form. Maybe it will make a come-back in the future with
fewer memory and latency compromises.
This adds support for CAS_64{_VALUE}, CAS_PTR_2{_VALUE},
LOAD_64, STORE_64 and other primitives built on universal
CAS primitive.
Patch submitted by Olivier Houchard <cognet@FreeBSD>.
The array is optimized for SPMC and fast iteration (though MPMC
transformation is also possible). This is an extremely simple
implementation with support for atomic in-place modification
through put -> remove elimination.
Besides implementing Thumb 2 supports, this fixes incorrect usage of
"cmp" where "cmpeq" was meant.
Patch submitted by Olivier Houchard <cognet@freebsd>.
This operation moves ownership from one hash set object
to another and re-assigns callback functions to developer-specified
values. This allows for dynamic configuration of allocation
callbacks and is necessary for use-cases involving executable code
which may be unmapped underneath the hash set.
The developer is responsible for enforcing barriers and enforcing
the visibility of the new hash set.
When spinning on global counters, it cannot be assumed that is_locked
functions will guarantee atomic to load ordering, an explicit fence
is necessary. is_locked will only guarantee load ordering.
These come in the form of CK_ELIDE_ADAPTIVE_PROTOTYPE,
CK_ELIDE_LOCK_ADAPTIVE and CK_ELIDE_UNLOCK_ADAPTIVE.
Primarily pushing this for the few that are playing with
master.
This is inspired by Andi Kleen's work for adaptive behavior
in the Linux kernel's RTM locks implementation. There are
various differences in the state machine, however. Specifically,
the concept of a retry and a busy-wait has been unified due
to state machine simplification such that any exhausted busy-wait
cycle reverts to a forfeit (a busy-wait is a specialized retry).
Follow-up work will involve allowing for is_locked behavior
to yield what users expect, if called from with-in a transaction
through the wrapper. It is warned that this will come at a performance
penalty.