ck_ec implements 32- and (on 64 bit platforms) 64- bit event
counts. Event counts let us easily integrate OS-level blocking (e.g.,
futexes) in lock-free protocols. Waking up waiters only locks in the
OS kernel, and does not happen at all when no waiter is blocked.
Waiters only block conditionally, if the event count's value is
still equal to some prior value.
ck_ec supports multiple producers (wakers) and consumers (waiters),
and, on x86-TSO, has a more efficient specialisation for single
producer mode. In the latter mode, the overhead compared to a version
counter is on the order of 2-3 cycles and 1-2 instructions, in the
fast path. The slow path, when there are threads blocked on the event
count, consists of one additional atomic instruction and a futex
syscall.
Similarly, the fast path for consumers, when an update comes quickly,
has no overhead compared to spinning on a read-only counter. After
a few thousand cycles, consumers (waiters) enter the slow path with
one atomic instruction and a few blocking syscalls.
The single-producer specialisation requires the x86-TSO memory model,
x86's non-atomic read-modify-write instructions, and, ideally a
futex-like OS abstraction. On !x86/x86_64 platforms, single producer
increments fall back to the multiple producer code path.
Fixes https://github.com/concurrencykit/ck/issues/79
On FreeBSD, atomic operations in the kernel must access the nucleus
address space. Userland may use either the atomic instruction set
which goes without an ASI (address space identifier) or specify the
primary address space.
To avoid hardcoding the address space here, we grab the corresponding
identifier from the appropriate machine header but also only for the
kernel so the namespace doesn't get polluted for userland.
This work is from jtl@FreeBSD.org. FreeBSD expects to call ck_epoch_poll
from a record that is in an active section. Previously, it was
considered an API violation to call write-side functions while in a read
section.
This is now permitted for poll as we we serialize behind the global
epoch counter.
Note that these functions are not reentrant. In the case of the
FreeBSD kernel, all these functions are called with preemption disabled.
This work is from Jonathan T. Looney from the FreeBSD project
(jtl@).
The return value of ck_epoch_poll has also changed. It returns false
only if the epoch counter has not progressed, if no memory was reclaimed
(in other words, no forward progress) and / or not all threads have been
observed in a quiescent state (grace period).
Below are his notes:
Epoch calls are stored in a 4-bucket hash table. The 4-bucket hash table
allows for calls to be stored for three epochs: the current epoch and
two previous ones. The comments at the beginning of ck_epoch.c explain
why this is necessary.
When there are active threads, ck_epoch_poll_deferred() current runs the
epoch calls for the current global epoch + 1. Because of modulo
arithmetic, this is equivalent to running the calls for epoch - 3.
However, this means that ck_epoch_poll_deferred() is waiting
unnecessarily long to run epoch calls.
Further, there could be races in incrementing the global epoch. Imagine
all active threads have observed epoch n. CPU 0 sees this. It increments
the global epoch to (n + 1). It runs the epoch calls for (n - 3). Now,
CPU 1 checks. It sees that there are active threads which have not yet
observed the new global epoch (n + 1). In this case,
ck_epoch_poll_deferred() will return without running any epochs. In the
worst case (CPU 1 continually losing the race), these epoch calls could
be deferred indefinitely.
To fix this, always run any epoch calls for global epoch - 2. Further,
if all active threads have observed the global epoch, run epoch calls
for global epoch - 1.
The global epoch is only incremented when all active threads have
observed it. Therefore, all active threads must always have observed
global epoch - 1 or the current global epoch. Accordingly, it is safe to
always run epoch calls for global epoch - 2.
Further, if all active threads have observed the global epoch, it is
safe to run epoch calls for global epoch - 1.
Don't attempt to be to smart, and just follow the algorithm, failing to
do so may lead to getting a thread to wrongly believe it owns the lock
when it does not.
This should fix the random failures reported on PPC with many threads.
These new macros are very convenient for modifying a SLIST after
using CK_SLIST_FOREACH_PREVPTR to find an element to remove
or a position to insert.
FreeBSD sys/queue.h already has SLIST_REMOVE_PREVPTR.
I would like to use the new macros in a change that I am planning
for FreeBSD kernel:
https://reviews.freebsd.org/D16016https://reviews.freebsd.org/D15905
build: add linux-ppc64le target.
There appears to be a regression on the target localized to epoch section optimization. I will need to investigate further.
* Implement ck_pr_dec_is_zero family of functions
* include/ck_pr.h: add ck_pr_{dec,inc}_is_zero and implement ck_pr_{dec,inc}_zero in terms of the new functions. Convert the architecture-specific implementations of ck_pr_foo_zero for x86 and x86-64 to ck_pr_foo_is_zero.
* regressions/ck_pr/validate: add smoke tests for ck_pr_dec_{,is_}zero and ck_pr_inc_{,is_}zero
* doc: document ck_pr_inc_is_zero
build: Tea CI integration.
* build: Rename travis.sh to ci-build.sh
* build: Add .drone.yml for Tea CI
* README: Format as Markdown
* README: Add CI badges
In an attempt to prevent gcc from emiting warnings, ck_pr_md_load_ptr and
ck_pr_md_store_ptr were made wrong in commit
5ae12a19d0.
load_ptr would return target instead of *target, and store would store the
value in target instead of in *target.
This is an attempt at fixing this, while still trying to avoid warnings.
This primarily affects the FreeBSD kernel, where the popcount builtin
can be problematic (relies on compiler-provided libraries). See the
history of __POPCNT__ for details [1].
- A new flag, CK_MD_CC_BUILTIN_DISABLE, can be set to indicate that CK
should not rely on compiler builtins when possible.
- ck_cc_clz has been removed, it was unused.
- ck_internal_bsf has been removed, it was duplicate of ck_cc_ffs but broken,
replaced in favor of ck_cc_ffs. Previous consumers were using the bsf
instruction, eitherway.
- ck_{rhs,hs,ht} have been updated to use ck_cc_ffs*.
If FreeBSD requires the builtins for performance reasons, we will lift the
appropriate detection into ck_md (at least, bt* bs* family of functions don't
have the same problems on most targets unlike popcount).
1: https://lists.freebsd.org/pipermail/svn-src-head/2015-March/069663.html
A note has also been added around some ambiguity with respect to WC
memory and relaxed memory semantics (so, heavier-weight mfence semantics
for strict acquire-release interface).
All fences related to atomic operations have been removed as they were
just unnecessary, and so, confusing.