In an attempt to prevent gcc from emiting warnings, ck_pr_md_load_ptr and
ck_pr_md_store_ptr were made wrong in commit
5ae12a19d0.
load_ptr would return target instead of *target, and store would store the
value in target instead of in *target.
This is an attempt at fixing this, while still trying to avoid warnings.
This primarily affects the FreeBSD kernel, where the popcount builtin
can be problematic (relies on compiler-provided libraries). See the
history of __POPCNT__ for details [1].
- A new flag, CK_MD_CC_BUILTIN_DISABLE, can be set to indicate that CK
should not rely on compiler builtins when possible.
- ck_cc_clz has been removed, it was unused.
- ck_internal_bsf has been removed, it was duplicate of ck_cc_ffs but broken,
replaced in favor of ck_cc_ffs. Previous consumers were using the bsf
instruction, eitherway.
- ck_{rhs,hs,ht} have been updated to use ck_cc_ffs*.
If FreeBSD requires the builtins for performance reasons, we will lift the
appropriate detection into ck_md (at least, bt* bs* family of functions don't
have the same problems on most targets unlike popcount).
1: https://lists.freebsd.org/pipermail/svn-src-head/2015-March/069663.html
A note has also been added around some ambiguity with respect to WC
memory and relaxed memory semantics (so, heavier-weight mfence semantics
for strict acquire-release interface).
All fences related to atomic operations have been removed as they were
just unnecessary, and so, confusing.
Add a new configure option, --enable-lse, which is only effective for
the AArch64 architecture. When used, most ck_pr_* atomics will use Large
System Extensions instructions as per the ARMv8.1 specification, rather
then LL/SC instruction pairs.
We don't have to claim we will read the value from variables when we do not,
this was only done to work around a bug on some versions of gcc for arm
a while ago, hopefully this won't be needed here.
This should fix the (harmless) warnings described in issue #83.
These primitives are meant to be used in lock implementations
where control dependency ordering is sufficient to enforce
ordering of critical section. At the moment, this only affects
PPC. Currently, we rely on lwsync for entry into critical sections
which is insufficient. sync is rather heavy-weight, and assuming
we aren't falling victim into compiler re-ordering, isync should
be sufficient.
There is follow-up work to be done in ARM, as we may have cheaper
(but target-specialized) ISB-tricks for load-load ordering.
DECONST_PTR is a hack to deconstify void pointer values
that is safe in presence of -Wcast-qual. CK_CC_RESTRICT
is restrict qualifier that can be disabled for only
partially C99 compliant compilers.
We use some macro trickery to enforce that ck_pr_store_* is actually
storing the correct type into the target variable, without any actual
side effects--by making the assignment into an rvalue and using a
comma expression, the compiler should optimize it away.
On the load side, we simply cast the result to the type of the target
variable for pointer loads.
There is an unsafe version of the store_ptr macro called
ck_pr_store_ptr_unsafe for those times when you are _really_ sure that
you know what you're doing.
This commit also updates some of the source files (ck_ht, ck_hs,
ck_rhs): ck_ht now uses the unsafe macro, as its conversion between
uintptr_t and void * is invalid under the new macros. ck_hs and ck_rhs
have had some casts added to preserve validity.
Commit 554e2f08 removed underscores from _CK prefixes. It missed
the arm bits, breaking the arm build (which checks for the
non-existing _CK_PR_H). Fix it.
While at it, fix the mismatch between _CK_ISB and __CK_ISB; convert
them both to CK_ISB.
Signed-off-by: Emilio G. Cota <cota@braap.org>
This was accidentally grouped into previous commit.
The destiny of this interface for internal use is still unclear (in context of
utilization in built-in data structures). The interface is enabled by default
on x86, as it is compatible with read-side prefetch* operations and
binary-compatible with 3DNow! extension. Older compilers will waste an
additional byte (they generate 3DNow! variant) on this, but older compilers
waste more on spillage if we encode instruction. Power support is coming soon.