This has been on the TODO for a while and helps reduce
read-side retries. It also has the advantage of providing
true wait-freedom on insertion (including termination safety).
The tombstone and version counter update invariant was not respected
in all necessary places.
- If a concurrent load operation is preempted after observing
the version counter and key field of a slot, and the slot is then
moved and re-used by another key-value pair, the load operation would
observe an inconsistent pair without the relevant version counter
update.
- On RMO architectures, a store fence was missing on the delete path
(tombstone placement must always be followed by version counter
update).
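
The invariant above can be sketched with C11 atomics. This is a minimal
illustration, not the ck implementation: struct slot, TOMBSTONE and the
slot_* functions are hypothetical names, and a single writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOMBSTONE UINTPTR_MAX   /* hypothetical reserved key value */

struct slot {
    atomic_uint version;        /* bumped after every tombstone placement */
    atomic_uintptr_t key;
    atomic_uintptr_t value;
};

void slot_init(struct slot *s)
{
    atomic_init(&s->version, 0);
    atomic_init(&s->key, TOMBSTONE);
    atomic_init(&s->value, 0);
}

/* Writer: tombstone first, store fence, then the version counter update. */
void slot_delete(struct slot *s)
{
    atomic_store_explicit(&s->key, TOMBSTONE, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_fetch_add_explicit(&s->version, 1, memory_order_relaxed);
    /* Order the version bump before any later re-use of the slot. */
    atomic_thread_fence(memory_order_release);
}

/* Writer: only valid on a slot that currently holds a tombstone. */
void slot_reuse(struct slot *s, uintptr_t key, uintptr_t value)
{
    assert(key != TOMBSTONE);
    atomic_store_explicit(&s->value, value, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->key, key, memory_order_relaxed);
}

/* Reader: retry whenever the version counter moved during the read. */
bool slot_load(struct slot *s, uintptr_t key, uintptr_t *value)
{
    for (;;) {
        unsigned int before =
            atomic_load_explicit(&s->version, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        uintptr_t k = atomic_load_explicit(&s->key, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        uintptr_t v = atomic_load_explicit(&s->value, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        if (atomic_load_explicit(&s->version, memory_order_relaxed) != before)
            continue;           /* slot was tombstoned meanwhile, retry */
        if (k != key)
            return false;
        *value = v;
        return true;
    }
}
```

A reader that was preempted between the key and value loads sees a changed
version counter on its final check and retries instead of returning a
mismatched pair.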
This makes configure smell more like standard configure scripts and
makes it easier for larger build systems to control the nature of the build.
When building non-pic, don't build the shared object, as a non-pic
shared object doesn't make a huge amount of sense.
Add a read-mostly mode, in which entries won't share cache lines with
associated data (probes, probe_bound, etc).
This makes write operations slower, but makes get faster.
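
The layout difference can be pictured roughly as below. The struct and field
names are hypothetical (only probes and probe_bound come from the text), and
a 64-byte cache line is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64u  /* assumed cache line size in bytes */

/* Default layout: each entry pointer sits next to its metadata. */
struct entry_interleaved {
    const void *entry;
    unsigned int probes;
    unsigned int probe_bound;
};

/* Read-mostly layout: metadata lives in a parallel array. */
struct entry_meta {
    unsigned int probes;
    unsigned int probe_bound;
};

struct table_read_mostly {
    const void **entries;       /* densely packed, scanned by get */
    struct entry_meta *meta;    /* touched mainly by write operations */
    size_t capacity;
};

/* Number of slots of the given size that fit in one cache line. */
size_t slots_per_line(size_t slot_size)
{
    assert(slot_size > 0 && slot_size <= CACHE_LINE);
    return CACHE_LINE / slot_size;
}

/* A get only dereferences the dense entries array. */
const void *rm_get(const struct table_read_mostly *t, size_t slot)
{
    assert(slot < t->capacity);
    return t->entries[slot];
}
```

With the split layout a probing reader covers more entries per cache line,
while a writer has to touch two arrays instead of one.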
In ck_rhs_do_backward_shift_delete(), check whether any entry with the same
hash is stored further along, and if not, update probe_bound for every entry
being shifted, instead of just doing it for the slot being emptied.
This inefficiency was introduced in the overhaul of
the ck_epoch API.
Synchronize is executed with respect to e. At e + 1,
references can only exist to objects logically deleted
at e or e + 1. At e + 2, however, references can only exist to
objects logically deleted at e + 1 and e + 2. In the case that a
thread observes an out of date epoch value, an increment to the
global epoch would fail as the active bit is ordered with respect
to the memory barrier in synchronize. In the case that a protected
section begins after the memory barrier, then it is guaranteed
to not acquire the hazardous reference.
This does not change granularity of deferral lists, however.
There is still a requirement of 3 deferral lists on the fast path
(4 in ck_epoch for fast path purposes) as at any moment, any given
deferral list for value e can contain references to objects with
active references from both e and e - 1.
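
A single-threaded sketch of the deferral list indexing described above.
EPOCH_LISTS, struct deferral, defer and dispatch are hypothetical names, and
counters stand in for real lists of deferred objects. Four lists are used
here so the index can be computed with a mask, which is presumably what the
fast path remark refers to.

```c
#include <assert.h>

#define EPOCH_LISTS 4u  /* power of two, at least the 3 lists required */

struct deferral {
    unsigned int pending[EPOCH_LISTS];  /* objects deferred per epoch slot */
    unsigned int reclaimed;             /* objects released so far */
};

/* Record an object logically deleted while the global epoch was `epoch`. */
void defer(struct deferral *d, unsigned int epoch)
{
    d->pending[epoch & (EPOCH_LISTS - 1)]++;
}

/*
 * Called once the global epoch has been observed at `epoch`. Objects
 * deleted at epoch - 2 can no longer be referenced, so their list is
 * released. Lists for epoch - 1 and epoch are left alone.
 */
void dispatch(struct deferral *d, unsigned int epoch)
{
    unsigned int i = (epoch - 2) & (EPOCH_LISTS - 1);

    assert(i < EPOCH_LISTS);
    d->reclaimed += d->pending[i];
    d->pending[i] = 0;
}
```

Deferring at epochs 5 and 6 and then dispatching at epoch 7 releases only the
object from epoch 5; the one from epoch 6 waits until epoch 8.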
This operation is short-hand notation for rebuilding
a hash table. This rebuild can occur in the presence
of concurrent readers and will require twice the amount
of memory of the existing hash table until completion.
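
A simplified, single-threaded sketch of such a rebuild. struct map and
map_rebuild are hypothetical names, keys double as their own hash, and the
capacity is assumed to be a power of two. With concurrent readers the pointer
swap would have to be an atomic store and the old array could only be freed
after a grace period.

```c
#include <assert.h>
#include <stdlib.h>

#define EMPTY     0u
#define TOMBSTONE (~0u)

struct map {
    unsigned int *slots;
    size_t capacity;    /* power of two */
};

/* Returns 0 on success, -1 if the second array cannot be allocated. */
int map_rebuild(struct map *m)
{
    size_t mask = m->capacity - 1;
    unsigned int *fresh, *old;

    assert((m->capacity & mask) == 0);
    fresh = calloc(m->capacity, sizeof *fresh);
    if (fresh == NULL)
        return -1;

    /* Re-insert live keys only; tombstones are dropped. */
    for (size_t i = 0; i < m->capacity; i++) {
        unsigned int key = m->slots[i];
        size_t j;

        if (key == EMPTY || key == TOMBSTONE)
            continue;
        for (j = key & mask; fresh[j] != EMPTY; j = (j + 1) & mask)
            ;
        fresh[j] = key;
    }

    /* Both arrays are alive until this point: twice the memory. */
    old = m->slots;
    m->slots = fresh;
    free(old);
    return 0;
}
```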
This borrows from a technique described by Purcell and Harris
in the "Non-blocking hashtables with open addressing" technical report.
Essentially, every slot will have an associated local probe maximum,
including tombstones. Highly aggressive workloads may still require
occasional garbage collection.
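
The idea of a per-slot probe maximum can be sketched as follows. This is a
single-threaded toy with linear probing, not the ck code; struct table and
the table_* functions are hypothetical, keys double as their hash, and the
values 0 and UINT_MAX are reserved.

```c
#include <assert.h>
#include <stdbool.h>

#define CAPACITY  8u
#define MASK      (CAPACITY - 1)
#define EMPTY     0u
#define TOMBSTONE (~0u)

struct table {
    unsigned int keys[CAPACITY];
    unsigned int bound[CAPACITY];   /* longest probe that started here */
};

/* Assumes the key is not already present. */
bool table_insert(struct table *t, unsigned int key)
{
    unsigned int h = key & MASK;

    assert(key != EMPTY && key != TOMBSTONE);
    for (unsigned int n = 0; n < CAPACITY; n++) {
        unsigned int i = (h + n) & MASK;

        if (t->keys[i] == EMPTY || t->keys[i] == TOMBSTONE) {
            t->keys[i] = key;
            if (n + 1 > t->bound[h])
                t->bound[h] = n + 1;
            return true;
        }
    }
    return false;
}

/* Lookup never inspects more than bound[h] slots, tombstones or not. */
bool table_contains(const struct table *t, unsigned int key,
    unsigned int *inspected)
{
    unsigned int h = key & MASK, n;

    for (n = 0; n < t->bound[h]; n++) {
        unsigned int i = (h + n) & MASK;

        if (t->keys[i] == key) {
            *inspected = n + 1;
            return true;
        }
        if (t->keys[i] == EMPTY) {
            n++;
            break;
        }
    }
    *inspected = n;
    return false;
}

bool table_remove(struct table *t, unsigned int key)
{
    unsigned int h = key & MASK;

    for (unsigned int n = 0; n < t->bound[h]; n++) {
        unsigned int i = (h + n) & MASK;

        if (t->keys[i] == key) {
            t->keys[i] = TOMBSTONE; /* bound[h] is left untouched */
            return true;
        }
        if (t->keys[i] == EMPTY)
            break;
    }
    return false;
}
```

A failed lookup stops at the bound of its home slot rather than walking every
tombstone up to the next empty slot. The bounds only grow in this sketch,
which is one reason occasional garbage collection remains useful.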
This abstracts away pointer packing tricks to a macro,
fixes comparison of pointers to occur in absence of
embedded values and improves robustness of pointer marshaling
for write-side operations for pointers that may already have
had pointer packing tricks played on them.
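
A rough sketch of what such macros can look like. VMA_BITS, PTR_VMA, PTR_TAG
and ptr_pack are hypothetical names, and the example assumes 64-bit pointers
of which only the low 48 bits carry the address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VMA_BITS 48u    /* assumed number of significant address bits */
#define VMA_MASK ((UINT64_C(1) << VMA_BITS) - 1)

/* Address bits only, with any embedded value stripped. */
#define PTR_VMA(p) ((uintptr_t)(p) & (uintptr_t)VMA_MASK)
/* Embedded value stored above the address bits. */
#define PTR_TAG(p) ((unsigned int)((uintptr_t)(p) >> VMA_BITS))

_Static_assert(sizeof(uintptr_t) == 8, "sketch assumes 64-bit pointers");

/*
 * Embed a 16-bit value in a pointer. The input is stripped first, so a
 * pointer that was already packed does not leak its old value.
 */
const void *ptr_pack(const void *p, unsigned int tag)
{
    assert(tag <= 0xFFFFu);
    return (const void *)(PTR_VMA(p) | ((uintptr_t)tag << VMA_BITS));
}

/* Compare by address, ignoring embedded values on either side. */
bool ptr_same_object(const void *a, const void *b)
{
    return PTR_VMA(a) == PTR_VMA(b);
}
```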
This function allows for faster insertions into tombstone-heavy
probe sequences by short-circuiting on tombstones rather than
continuing to probe. The user must already guarantee that the
entry being inserted is unique. If a non-unique key is inserted
with this operation, undefined behavior will result.
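
The difference from an ordinary put can be sketched as below. This is a
single-threaded toy with linear probing; put_unique and put are hypothetical
names, keys double as their hash, and the capacity is assumed to be a power
of two.

```c
#include <assert.h>
#include <stddef.h>

#define EMPTY     0u
#define TOMBSTONE (~0u)

/*
 * Unique insert: the first tombstone or empty slot wins. Correct only if
 * the caller guarantees the key is not already in the table.
 * Returns the slot used, or capacity if the table is full.
 */
size_t put_unique(unsigned int *slots, size_t capacity, unsigned int key,
    size_t *probes)
{
    size_t mask = capacity - 1;

    assert((capacity & mask) == 0);
    for (size_t n = 0; n < capacity; n++) {
        size_t i = (key + n) & mask;

        if (slots[i] == EMPTY || slots[i] == TOMBSTONE) {
            slots[i] = key;
            *probes = n + 1;
            return i;
        }
    }
    *probes = capacity;
    return capacity;
}

/*
 * Ordinary put: must keep probing past tombstones until an empty slot
 * proves the key is absent, then re-uses the first tombstone seen.
 */
size_t put(unsigned int *slots, size_t capacity, unsigned int key,
    size_t *probes)
{
    size_t mask = capacity - 1, reuse = capacity, n;

    assert((capacity & mask) == 0);
    for (n = 0; n < capacity; n++) {
        size_t i = (key + n) & mask;

        if (slots[i] == key) {
            *probes = n + 1;
            return i;           /* already present */
        }
        if (slots[i] == TOMBSTONE && reuse == capacity)
            reuse = i;
        if (slots[i] == EMPTY) {
            if (reuse == capacity)
                reuse = i;
            n++;
            break;
        }
    }
    *probes = n;
    if (reuse != capacity)
        slots[reuse] = key;
    return reuse;
}
```

On a probe sequence that starts with two tombstones, put_unique finishes
after one probe while put needs four before it can re-use the first
tombstone. Calling put_unique with a key that is already stored further along
the sequence would leave two copies in the table.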
This assumes adjacent sector prefetch and appears to have minimal
clustering effect. In addition to this, developers are free to define
their own linear probe length by defining CK_HS_PROBE_L1_SHIFT.
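
How a shift-defined linear probe run can stay inside one aligned group of
slots is sketched below. PROBE_L1_SHIFT and probe_l1 are hypothetical
stand-ins, and the default of 3 is chosen for illustration only and is not
necessarily the default used for CK_HS_PROBE_L1_SHIFT.

```c
#include <assert.h>
#include <stddef.h>

#ifndef PROBE_L1_SHIFT
#define PROBE_L1_SHIFT 3u   /* illustrative default: runs of 8 slots */
#endif
#define PROBE_L1      (1u << PROBE_L1_SHIFT)
#define PROBE_L1_MASK (PROBE_L1 - 1)

/*
 * Index of the j-th slot visited by the linear run that starts at
 * `offset`. The run wraps around inside its aligned group of PROBE_L1
 * slots, so it never leaves the memory the group occupies.
 */
size_t probe_l1(size_t offset, unsigned int j)
{
    size_t base = offset & ~(size_t)PROBE_L1_MASK;

    assert(j < PROBE_L1);
    return base + ((offset + j) & PROBE_L1_MASK);
}
```

Starting at slot 13 with the default shift, the run visits 13, 14, 15 and
then wraps to 8, staying within slots 8 through 15.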
Could not find suitable use-case and generally doesn't
appear interesting to academics in the existing
form. Maybe it will make a come-back in the future with
fewer memory and latency compromises.
The array is optimized for SPMC and fast iteration (though MPMC
transformation is also possible). This is an extremely simple
implementation with support for atomic in-place modification
through put -> remove elimination.
This operation moves ownership from one hash set object
to another and re-assigns callback functions to developer-specified
values. This allows for dynamic configuration of allocation
callbacks and is necessary for use-cases involving executable code
which may be unmapped underneath the hash set.
The developer is responsible for enforcing barriers and enforcing
the visibility of the new hash set.
It is possible for a defragmenting set or swap operation
to set a tombstone. If the probe sequence does not encounter
an empty slot and hits the maximum write-side probe limit first,
it is possible for it to fail to reprobe the defragmenting store.