This makes configure smell more like standard configure scripts and
makes it easier for larger build systems to control the nature of the build.
When building non-pic, don't build the shared object, as a non-pic
shared object doesn't make a huge amount of sense.
Add a read-mostly mode, in which entries won't share cache lines with
associated data (probes, probe_bound, etc).
This makes write operations slower, but makes get faster.
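
As a rough illustration of the layout trade-off (not the actual ck_rhs
structures; every name below is hypothetical), the sketch contrasts an
interleaved slot layout with a split layout in which readers scan an
array that holds nothing but entry pointers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical default layout: entry and metadata are interleaved. */
struct slot_packed {
    const void *entry;
    unsigned int probes;
    unsigned int probe_bound;
};

/* Hypothetical read-mostly layout: metadata lives in its own array. */
struct slot_meta {
    unsigned int probes;
    unsigned int probe_bound;
};

struct map_read_mostly {
    const void **entries;     /* scanned by readers */
    struct slot_meta *descs;  /* touched mainly by writers */
    size_t capacity;
};

static int
map_read_mostly_init(struct map_read_mostly *m, size_t capacity)
{
    assert(m != NULL && capacity > 0);
    m->entries = calloc(capacity, sizeof *m->entries);
    m->descs = calloc(capacity, sizeof *m->descs);
    m->capacity = capacity;
    if (m->entries == NULL || m->descs == NULL) {
        free(m->entries);
        free(m->descs);
        return -1;
    }
    return 0;
}

/* How many entries a reader sees per cache line of `line` bytes. */
static size_t
entries_per_line_packed(size_t line)
{
    return line / sizeof(struct slot_packed);
}

static size_t
entries_per_line_read_mostly(size_t line)
{
    return line / sizeof(const void *);
}
```

The split layout packs more entries into each line a reader touches, at
the cost of a writer having to update two separate arrays.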
In ck_rhs_do_backward_shift_delete(), check whether any entry with the
same hash is stored further along, and if not, update probe_bound for
every entry being shifted, instead of just doing it for the slot being
emptied.
This inefficiency was introduced in the overhaul of
the ck_epoch API.
Synchronize is executed with respect to e. At e + 1,
references can only exist to objects logically deleted
at e or e + 1. At e + 2, however, references can only exist to
objects logically deleted at e + 1 and e + 2. In the case that a
thread observes an out of date epoch value, an increment to the
global epoch would fail as the active bit is ordered with respect
to the memory barrier in synchronize. In the case that a protected
section begins after the memory barrier, then it is guaranteed
to not acquire the hazardous reference.
This does not change granularity of deferral lists, however.
There is still a requirement of 3 deferral lists on the fast path
(4 in ck_epoch for fast path purposes) as at any moment, any given
deferral list for value e can contain references to objects with
active references from both e and e - 1.
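
A minimal sketch of how an epoch value could select a deferral list,
assuming four lists and a power-of-two mask (EPOCH_LENGTH and the
function names are placeholders for this example, not the ck_epoch
internals). It checks that the lists for e - 1, e and e + 1 never alias:

```c
#include <assert.h>

#define EPOCH_LENGTH 4U /* assumed: 3 lists needed, rounded up to 4 */

static_assert((EPOCH_LENGTH & (EPOCH_LENGTH - 1)) == 0,
    "a mask only replaces modulo for a power of two");

/* Map an epoch value to the index of its deferral list. */
static unsigned int
deferral_index(unsigned int epoch)
{
    return epoch & (EPOCH_LENGTH - 1);
}

/* Returns 1 if the lists for e - 1, e and e + 1 are pairwise distinct. */
static int
deferral_lists_distinct(unsigned int e)
{
    unsigned int a = deferral_index(e - 1);
    unsigned int b = deferral_index(e);
    unsigned int c = deferral_index(e + 1);

    return a != b && b != c && a != c;
}
```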
This operation is short-hand notation for rebuilding
a hash table. This rebuild can occur in the presence
of concurrent readers and will require twice the amount
of memory of the existing hash table until completion.
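
The idea can be sketched with a toy open-addressed table of unsigned
keys (all names here are hypothetical, and this is not the library's
implementation). The rebuild populates a fresh table of the same
capacity, which is why both tables exist at once until the old one is
released:

```c
#include <assert.h>
#include <stdlib.h>

#define EMPTY 0U
#define TOMBSTONE ((unsigned int)-1)

struct map {
    unsigned int capacity; /* power of two */
    unsigned int *slots;
};

static struct map *
map_create(unsigned int capacity)
{
    struct map *m;

    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    m = malloc(sizeof *m);
    if (m == NULL)
        return NULL;
    m->slots = calloc(capacity, sizeof *m->slots);
    if (m->slots == NULL) {
        free(m);
        return NULL;
    }
    m->capacity = capacity;
    return m;
}

/* Linear-probe insert of a key assumed to be absent; 0 if table is full. */
static int
map_put(struct map *m, unsigned int key)
{
    for (unsigned int i = 0; i < m->capacity; i++) {
        unsigned int slot = (key + i) & (m->capacity - 1);

        if (m->slots[slot] == EMPTY) {
            m->slots[slot] = key;
            return 1;
        }
    }
    return 0;
}

/*
 * Copy live keys into a fresh table, dropping tombstones. The old table
 * is left untouched so readers could keep using it; publishing the new
 * table and reclaiming the old one safely is left to the caller.
 */
static struct map *
map_rebuild(const struct map *old)
{
    struct map *fresh = map_create(old->capacity);

    if (fresh == NULL)
        return NULL;
    for (unsigned int i = 0; i < old->capacity; i++) {
        unsigned int k = old->slots[i];

        if (k != EMPTY && k != TOMBSTONE)
            map_put(fresh, k);
    }
    return fresh;
}

static unsigned int
map_count_tombstones(const struct map *m)
{
    unsigned int n = 0;

    for (unsigned int i = 0; i < m->capacity; i++)
        n += (m->slots[i] == TOMBSTONE);
    return n;
}
```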
This borrows from a technique described by Purcell and Harris
in the technical report "Non-blocking hashtables with open addressing".
Essentially, every slot will have an associated local probe maximum,
including tombstones. Highly aggressive workloads may still require
occasional garbage collection.
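
A simplified, single-threaded sketch of a per-slot probe bound, assuming
linear probing and unique nonzero keys (names and layout are invented
for illustration and are not taken from the report or from ck). A lookup
never probes further than the bound recorded for the key's home slot,
even when tombstones sit in the way:

```c
#include <assert.h>

#define CAP 16U /* power of two */
#define EMPTY 0U
#define TOMBSTONE ((unsigned int)-1)

struct set {
    unsigned int keys[CAP];
    unsigned int bound[CAP]; /* bound[h]: longest probe of any key homed at h */
};

static unsigned int
home(unsigned int key)
{
    return key & (CAP - 1);
}

/* Insert a key assumed to be absent; raise the home slot's bound if needed. */
static int
set_insert(struct set *s, unsigned int key)
{
    unsigned int h = home(key);

    assert(key != EMPTY && key != TOMBSTONE);
    for (unsigned int i = 0; i < CAP; i++) {
        unsigned int slot = (h + i) & (CAP - 1);

        if (s->keys[slot] == EMPTY || s->keys[slot] == TOMBSTONE) {
            s->keys[slot] = key;
            if (s->bound[h] < i + 1)
                s->bound[h] = i + 1;
            return 1;
        }
    }
    return 0;
}

/* Look up a key, probing at most bound[home] slots; reports probes used. */
static int
set_contains(const struct set *s, unsigned int key, unsigned int *probes)
{
    unsigned int h = home(key);

    *probes = 0;
    for (unsigned int i = 0; i < s->bound[h]; i++) {
        (*probes)++;
        if (s->keys[(h + i) & (CAP - 1)] == key)
            return 1;
    }
    return 0;
}

/* Replace a key with a tombstone; the bound is conservatively kept. */
static int
set_remove(struct set *s, unsigned int key)
{
    unsigned int h = home(key);

    for (unsigned int i = 0; i < s->bound[h]; i++) {
        unsigned int slot = (h + i) & (CAP - 1);

        if (s->keys[slot] == key) {
            s->keys[slot] = TOMBSTONE;
            return 1;
        }
    }
    return 0;
}
```

Because this sketch never lowers a bound on removal, long-lived tables
would accumulate stale bounds, which is one way to read the remark about
occasional garbage collection.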
This abstracts away pointer packing tricks to a macro,
fixes comparison of pointers to occur in absence of
embedded values and improves robustness of pointer marshaling
for write-side operations for pointers that may already have
had pointer packing tricks played on them.
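
A hedged sketch of what such a macro might look like, assuming 64-bit
pointers whose upper 16 bits are unused (48-bit virtual addresses). The
names PTR_VMA, ptr_pack and ptr_tag are made up for this example:

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(void *) == 8, "sketch assumes 64-bit pointers");

#define VMA_BITS 48 /* assumed width of a virtual address */
#define VMA_MASK ((UINT64_C(1) << VMA_BITS) - 1)

/* Strip packed bits; harmless on pointers that were never packed. */
#define PTR_VMA(p) ((void *)(uintptr_t)((uint64_t)(uintptr_t)(p) & VMA_MASK))

/* Pack a 16-bit tag into the upper bits, discarding any earlier tag. */
static void *
ptr_pack(const void *p, uint16_t tag)
{
    uint64_t raw = (uint64_t)(uintptr_t)PTR_VMA(p);

    return (void *)(uintptr_t)(raw | ((uint64_t)tag << VMA_BITS));
}

static uint16_t
ptr_tag(const void *p)
{
    return (uint16_t)((uint64_t)(uintptr_t)p >> VMA_BITS);
}

/* Compare the underlying addresses, ignoring embedded values. */
static int
ptr_same_object(const void *a, const void *b)
{
    return PTR_VMA(a) == PTR_VMA(b);
}
```

Stripping before packing is what makes the write side robust to
pointers that already carry a tag.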
This function allows for faster insertions into tombstone-heavy
probe sequences by short-circuiting on tombstones rather than
continuing to probe. The user must already guarantee that the
entry being inserted is unique. If a non-unique key is inserted
with this operation, undefined behavior will result.
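
The difference can be sketched on a toy linear-probing table (the
function names are hypothetical and the real probe sequence differs).
The regular put has to probe past tombstones to rule out a duplicate,
while the unique variant takes the first tombstone or empty slot:

```c
#include <assert.h>

#define CAP 8U /* power of two */
#define EMPTY 0U
#define TOMBSTONE ((unsigned int)-1)

/* Regular put: keeps probing until an empty slot to detect duplicates. */
static int
table_put(unsigned int *t, unsigned int key, unsigned int *probes)
{
    unsigned int h = key & (CAP - 1), free_slot = CAP;

    assert(key != EMPTY && key != TOMBSTONE);
    *probes = 0;
    for (unsigned int i = 0; i < CAP; i++) {
        unsigned int slot = (h + i) & (CAP - 1);

        (*probes)++;
        if (t[slot] == key)
            return 0; /* duplicate */
        if (t[slot] == TOMBSTONE || t[slot] == EMPTY) {
            if (free_slot == CAP)
                free_slot = slot;
            if (t[slot] == EMPTY)
                break;
        }
    }
    if (free_slot == CAP)
        return 0; /* table full */
    t[free_slot] = key;
    return 1;
}

/* Unique put: the caller guarantees the key is absent, so stop early. */
static int
table_put_unique(unsigned int *t, unsigned int key, unsigned int *probes)
{
    unsigned int h = key & (CAP - 1);

    assert(key != EMPTY && key != TOMBSTONE);
    *probes = 0;
    for (unsigned int i = 0; i < CAP; i++) {
        unsigned int slot = (h + i) & (CAP - 1);

        (*probes)++;
        if (t[slot] == TOMBSTONE || t[slot] == EMPTY) {
            t[slot] = key;
            return 1;
        }
    }
    return 0;
}
```

If the uniqueness guarantee is violated in this sketch, the table simply
ends up holding the key twice, which hints at why the real operation
leaves the result undefined.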
This assumes adjacent sector prefetch and appears to have minimal
clustering effect. In addition to this, developers are free to define
their own linear probe length by defining CK_HS_PROBE_L1_SHIFT.
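
A minimal sketch of a compile-time overridable linear probe length. The
macro PROBE_L1_SHIFT stands in for CK_HS_PROBE_L1_SHIFT, the value 3 is
a placeholder rather than the library default, and probe_slot is an
invented helper that wraps a linear run inside its aligned block:

```c
#include <assert.h>

#ifndef PROBE_L1_SHIFT
#define PROBE_L1_SHIFT 3U /* placeholder; override with -DPROBE_L1_SHIFT=n */
#endif

#define PROBE_L1 (1U << PROBE_L1_SHIFT)
#define PROBE_L1_MASK (PROBE_L1 - 1)

static_assert(PROBE_L1_SHIFT < 16, "keep the linear run reasonably short");

/*
 * Return the i-th slot of the linear run that starts at `offset`,
 * wrapping around inside the aligned block of PROBE_L1 slots.
 */
static unsigned long
probe_slot(unsigned long offset, unsigned int i)
{
    unsigned long base = offset & ~(unsigned long)PROBE_L1_MASK;

    return base + ((offset + i) & PROBE_L1_MASK);
}
```

With the placeholder shift of 3, a run starting at slot 13 visits 13,
14, 15 and then wraps to 8.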
Could not find a suitable use-case, and it generally doesn't
appear interesting to academics in the existing
form. Maybe it will make a come-back in the future with
fewer memory and latency compromises.
The array is optimized for SPMC and fast iteration (though MPMC
transformation is also possible). This is an extremely simple
implementation with support for atomic in-place modification
through put -> remove elimination.
This operation moves ownership from one hash set object
to another and re-assigns callback functions to developer-specified
values. This allows for dynamic configuration of allocation
callbacks and is necessary for use-cases involving executable code
which may be unmapped underneath the hash set.
The developer is responsible for enforcing barriers and enforcing
the visibility of the new hash set.
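
The shape of such a move can be sketched as follows. The types and
set_move are hypothetical stand-ins rather than the library's API, and
leaving the source handle untouched is a choice made for this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct allocator {
    void *(*malloc)(size_t);
    void (*free)(void *);
};

typedef unsigned long hash_fn(const void *key, unsigned long seed);

struct set_handle {
    struct allocator *m; /* allocation callbacks */
    hash_fn *hf;         /* hash callback */
    unsigned long seed;
    void *map;           /* backing storage being handed over */
};

/* Example callback so the sketch is self-contained. */
static unsigned long
sample_hash(const void *key, unsigned long seed)
{
    return (unsigned long)(uintptr_t)key ^ seed;
}

/*
 * Hand the backing storage of `src` to `dst` and install new callbacks.
 * No barriers are issued here; making `dst` visible to other threads
 * is the caller's job.
 */
static int
set_move(struct set_handle *dst, const struct set_handle *src,
    hash_fn *hf, struct allocator *m)
{
    assert(dst != NULL && src != NULL);
    if (hf == NULL || m == NULL || m->malloc == NULL || m->free == NULL)
        return 0;
    dst->map = src->map;
    dst->seed = src->seed;
    dst->hf = hf;
    dst->m = m;
    return 1;
}
```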
It is possible for a defragmenting set or swap operation
to set a tombstone. If the probe sequence does not encounter
an empty slot and hits the maximum write-side probe limit first,
it is possible for it to fail to reprobe the defragmenting store.
This function allows for explicit execution of all
deferred callbacks in an epoch_record. The primary
motivation is currently for performance profiling
but there are other use-cases where best-effort
semantics could be applied.
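
A toy sketch of draining every deferral list of a record in one call.
The structures and function names are invented for illustration and do
not mirror ck_epoch; the caller is assumed to know that no reader can
still hold references to the deferred objects:

```c
#include <assert.h>
#include <stddef.h>

#define N_LISTS 4U /* assumed number of deferral lists per record */

struct deferred {
    void (*fn)(struct deferred *);
    struct deferred *next;
};

struct record {
    struct deferred *pending[N_LISTS];
    unsigned int n_pending;
    unsigned int n_dispatch;
};

/* Example callback: only counts invocations. */
static unsigned int freed;

static void
count_free(struct deferred *d)
{
    (void)d;
    freed++;
}

/* Queue a callback on the list selected by the given epoch. */
static void
record_defer(struct record *r, unsigned int epoch, struct deferred *d,
    void (*fn)(struct deferred *))
{
    unsigned int i = epoch & (N_LISTS - 1);

    assert(fn != NULL);
    d->fn = fn;
    d->next = r->pending[i];
    r->pending[i] = d;
    r->n_pending++;
}

/* Run every deferred callback regardless of epoch; returns how many ran. */
static unsigned int
record_reclaim(struct record *r)
{
    unsigned int ran = 0;

    for (unsigned int i = 0; i < N_LISTS; i++) {
        struct deferred *head = r->pending[i];

        r->pending[i] = NULL;
        while (head != NULL) {
            struct deferred *next = head->next;

            head->fn(head);
            head = next;
            ran++;
        }
    }
    r->n_pending -= ran;
    r->n_dispatch += ran;
    return ran;
}
```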
This is similar to the set of changes done to ck_ht.
In addition to this, a bug was caught where the minimal
probe sequence was based on the last rather than the first
tombstone observed.
This hash set will now handle long-running bursty
write -> delete -> write workloads much more effectively
and includes significant performance improvements for
delete-heavy workloads (at least 2x measured).
Previously, a probe to an empty slot was necessary in order
to avoid hash table growth. As long as a tombstone is available,
re-use it. This prevents excessive growth on workloads involving
long bursts of writes (near 0.5 load factor) followed by long
bursts of deletes.
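
The change in policy can be sketched on a toy linear-probing table with
a write-side probe limit (all names are hypothetical, and the earlier
behaviour is modelled from the description above rather than from the
code). With reuse disabled, an insert that finds no empty slot within
the limit asks for growth; with reuse enabled, it takes the first
tombstone it saw:

```c
#include <assert.h>

#define CAP 8U /* power of two */
#define EMPTY 0U
#define TOMBSTONE ((unsigned int)-1)

enum put_result { PUT_OK, PUT_GROW };

static enum put_result
put_limited(unsigned int *t, unsigned int key, unsigned int limit,
    int reuse_tombstone)
{
    unsigned int h = key & (CAP - 1), first_free = CAP;

    assert(key != EMPTY && key != TOMBSTONE && limit <= CAP);
    for (unsigned int i = 0; i < limit; i++) {
        unsigned int slot = (h + i) & (CAP - 1);

        if (t[slot] == key)
            return PUT_OK; /* already present */
        if (t[slot] == TOMBSTONE) {
            if (first_free == CAP)
                first_free = slot;
            continue;
        }
        if (t[slot] == EMPTY) {
            /* Reached an empty slot: prefer an earlier tombstone. */
            if (first_free == CAP)
                first_free = slot;
            t[first_free] = key;
            return PUT_OK;
        }
    }
    /* Probe limit hit without seeing an empty slot. */
    if (reuse_tombstone && first_free != CAP) {
        t[first_free] = key;
        return PUT_OK;
    }
    return PUT_GROW;
}
```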