CK_CC_PACKED will drop structures to one-byte alignment in certain
cases. Obviously, this will mean bad performance on most architectures.
Thanks to Matt Johnson from https://rigel.crhc.illinois.edu/ for
reporting this problem.
Instead of executing a join on the threads, aggregate existing state
of their counters. This can mean one missed increment per thread,
which is not a big deal.
It has been changed to behave more like an autoconf-generated
configure script. Please see --help for more information.
Some options have been removed including --cflags and --compiler.
Other options have been renamed and have different semantics
now (headers and library specify full path).
Add APIs for doing atomic CAS/load/store/etc on 32-bit platforms. In
some cases this also includes operations on 64-bit integers using
cmpxchg8b. It is possible we could do some additional stuff on larger
integers using SSE, but the goal of this port is to target i586/k5 and
newer processors.
Samy mentions that we may want to do something or another for making
portability easier, but I wasn't paying tons of attention when he was
talking, so I forget what that was all about. Plus it's funny to write
in a commit message.
Haven't done a full run-through of validate / benchmark tests, but
a cursory runthrough seems to indicate all passing.
Previously, we wouldn't build on 32-bit architectures, let alone
configure. This was due in part to some issues where we essentially
ignored setting CFLAGS properly.
In addition, FreeBSD and possibly other BSDs only report i386 for any
32-bit x86 architecture. This has the side-effect that we have to do
some additional guesswork to determine the actual CPU. Since we make use
of cmpxchg8b, we require an i586 machine. FreeBSD's default 32-bit gcc
-march setting is i486, so in effort to make things easier for FreeBSD
users on 32-bit, set that to i586 by default.
Linux makes this a little easier for us, since its uname actually
returns useful information about the architecture (and assumedly the
compilers for that architecture target the same arch at a minimum), so
we will refuse to work on i386 / i486 on Linux as well.
But really, I'd be slightly surprised at a ton of use on pentium/k5.
Moved rdtsc and affinity logic to a single file which other
regression tests use. Single point of reference will ease
porting these to future architectures and platforms. Removed
invalid Copyright statement.
Added CK_CC_USED to force some code generation that I found
useful for debugging.
Added ck_stack latency tests and a modified version of djoseph's
modifications to benchmark.h for spinlock latency tests.