The Zbc extension defines instructions for carryless multiplication that
can be used to accelerate the calculation of CRC checksums. This
technique is described in Intel's whitepaper, "Fast CRC Computation for
Generic Polynomials Using PCLMULQDQ Instruction".
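For readers unfamiliar with carryless multiplication, the following is a
minimal software model in C of what the clmul and clmulh instructions
compute on RV64. The function names are hypothetical and this is only an
illustration of the operation, not code from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Low 64 bits of the 128-bit carry-less product (models clmul). */
static uint64_t clmul64(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; i++)
        if ((b >> i) & 1)
            r ^= a << i;            /* XOR instead of add: no carries */
    return r;
}

/* High 64 bits of the same product (models clmulh). */
static uint64_t clmulh64(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 1; i < 64; i++)    /* i = 0 never reaches bit 64 */
        if ((b >> i) & 1)
            r ^= a >> (64 - i);
    return r;
}

/* Hypothetical self-check: (x + 1)^2 = x^2 + 1 over GF(2). */
static void clmul_check(void)
{
    assert(clmul64(3, 3) == 5);
    assert(clmulh64(0x8000000000000000ULL, 2) == 1);
}
```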
The Zbb extension defines, among other bit manipulation operations, an
instruction for byte-reversing a register (rev8). This is used when
doing endianness swaps.
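As an illustration, the effect of rev8 on a 64-bit register can be
modelled in C as below. The function name is hypothetical; the assembly
in this patch uses the instruction directly.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 64-bit value (models rev8 on RV64). */
static uint64_t rev8_64(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (x & 0xff);  /* move the lowest byte to the top */
        x >>= 8;
    }
    return r;
}

/* Hypothetical self-check: swapping twice restores the value. */
static void rev8_check(void)
{
    uint64_t v = 0x0102030405060708ULL;
    assert(rev8_64(v) == 0x0807060504030201ULL);
    assert(rev8_64(rev8_64(v)) == v);
}
```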
crc_fold_common_clmul.h defines a macro that reduces a double-word
aligned buffer to 128 bits by folding four 128-bit chunks in parallel,
then folding a single 128-bit chunk at a time until fewer than two
remain. This macro can be reused for all the CRC algorithms, with
parameters controlling:
- where the seed is xor-ed into the first fold
- whether an endianness swap is needed on double-words read in
- whether the algorithm is reflected, which affects whether clmulh gives
back the high double word of a result or the low double word
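The idea behind a single fold can be sketched at reduced width in C.
This is an assumption-laden illustration, not the macro itself: it uses
32-bit chunks instead of 128-bit ones, the non-reflected T10-DIF
polynomial, no seed, and hypothetical helper names. It shows that
multiplying the leading chunk by a constant (x^32 mod P) and xor-ing it
into the next chunk leaves the remainder modulo P unchanged.

```c
#include <assert.h>
#include <stdint.h>

#define POLY 0x18BB7u /* T10-DIF generator with the x^16 term included */

/* Carry-less product of two 32-bit values. */
static uint64_t clmul32(uint32_t a, uint32_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 32; i++)
        if ((b >> i) & 1)
            r ^= (uint64_t)a << i;
    return r;
}

/* Remainder of v modulo POLY over GF(2), one bit at a time. */
static uint32_t poly_mod(uint64_t v)
{
    for (int i = 63; i >= 16; i--)
        if ((v >> i) & 1)
            v ^= (uint64_t)POLY << (i - 16);
    return (uint32_t)v;
}

/* Fold the upper 32-bit chunk of msg onto the lower one. */
static uint64_t fold_once(uint64_t msg)
{
    /* x^32 mod POLY; real code would precompute this constant. */
    uint32_t k = poly_mod(1ULL << 32);
    uint32_t hi = (uint32_t)(msg >> 32);
    uint32_t lo = (uint32_t)msg;
    return clmul32(hi, k) ^ lo;
}

/* Hypothetical self-check: folding preserves the remainder. */
static void fold_check(void)
{
    uint64_t m = 0x0123456789ABCDEFULL;
    assert(poly_mod(fold_once(m)) == poly_mod(m));
}
```

The folded value is shorter than the input (under 48 bits here), which
is why repeated folding shrinks a long buffer to a fixed-size block
that a final reduction can handle.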
The algorithms differ more in how the final 128 bits are reduced to a
32/64-bit result (which also depends on whether the algorithm is
reflected) and in how the buffer is made double-word aligned.
32-bit CRCs use Barrett reduction to reduce the start of the buffer
until what remains is double-word aligned, and to reduce any excess
left over after folding. As the CRC32 algorithms isa-l supports differ
in whether the seed is inverted and in their function signatures, the
alignment, excess and 128-bit reduction are defined as macros in
crc32_*_common_clmul.h that the implementations (crc32_*.S) include and
surround with algorithm-specific assembly and precomputed constants.
This also makes it straightforward to reuse the macros to calculate
crc16_t10dif.
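A Barrett reduction of a 64-bit value to a 32-bit remainder can be
sketched in C as follows. This is a sketch under stated assumptions: it
covers only the non-reflected IEEE 802.3 polynomial, computes the
constant mu at run time rather than embedding it, and uses hypothetical
function names. It follows the two-multiplication scheme from the Intel
whitepaper rather than the exact register usage of the macros.

```c
#include <assert.h>
#include <stdint.h>

#define P32 0x104C11DB7ULL /* CRC32 generator with the x^32 term included */

/* Carry-less product; callers keep the result within 64 bits. */
static uint64_t clmul(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; i++)
        if ((b >> i) & 1)
            r ^= a << i;
    return r;
}

/* mu = floor(x^64 / P32), by long division over GF(2). */
static uint64_t barrett_mu(void)
{
    uint64_t r = 1ULL << 32, q = 0;
    for (int i = 0; i <= 32; i++) {
        q <<= 1;
        if (r & (1ULL << 32)) {
            q |= 1;
            r ^= P32;
        }
        r <<= 1;                    /* bring down the next zero bit */
    }
    return q;
}

/* v mod P32 using two carry-less multiplications. */
static uint32_t barrett_reduce(uint64_t v)
{
    uint64_t t1 = clmul(v >> 32, barrett_mu()); /* estimate quotient */
    uint64_t t2 = clmul(t1 >> 32, P32);         /* quotient * P32 */
    return (uint32_t)(v ^ t2);
}

/* Bit-at-a-time reference used only to check the result. */
static uint32_t bitwise_mod(uint64_t v)
{
    for (int i = 63; i >= 32; i--)
        if ((v >> i) & 1)
            v ^= P32 << (i - 32);
    return (uint32_t)v;
}

/* Hypothetical self-check against the reference. */
static void barrett_check(void)
{
    uint64_t v = 0x0123456789ABCDEFULL;
    assert(barrett_reduce(v) == bitwise_mod(v));
}
```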
64-bit CRCs use a table-based reduction to align the buffer and handle
excess. All of isa-l's CRC64 algorithms pass arguments in the same order
and invert the seed before and after folding, so both
crc64_*_common_clmul.h headers contain a macro for defining a CRC64
function with a particular name. Each of the crc64_*.S files then
invokes that macro along with the precomputed constants and lookup
table.
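For reference, a byte-at-a-time table-based CRC64 can be sketched in C
as below. This is an illustration only: it assumes the non-reflected
ECMA-182 polynomial, builds the table at run time instead of embedding
it, and uses hypothetical function names. The seed is inverted on entry
and the result inverted on exit, matching the behaviour described
above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POLY64 0x42F0E1EBA9EA3693ULL /* ECMA-182, x^64 term implicit */

static uint64_t table[256];

/* Precompute the CRC of every possible top byte. */
static void build_table(void)
{
    for (int b = 0; b < 256; b++) {
        uint64_t crc = (uint64_t)b << 56;
        for (int i = 0; i < 8; i++)
            crc = (crc & (1ULL << 63)) ? (crc << 1) ^ POLY64 : crc << 1;
        table[b] = crc;
    }
}

/* One table lookup per input byte. */
static uint64_t crc64_table(uint64_t seed, const uint8_t *buf, size_t len)
{
    uint64_t crc = ~seed;
    for (size_t i = 0; i < len; i++)
        crc = table[(crc >> 56) ^ buf[i]] ^ (crc << 8);
    return ~crc;
}

/* Bit-at-a-time reference used only to check the table version. */
static uint64_t crc64_bitwise(uint64_t seed, const uint8_t *buf, size_t len)
{
    uint64_t crc = ~seed;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint64_t)buf[i] << 56;
        for (int b = 0; b < 8; b++)
            crc = (crc & (1ULL << 63)) ? (crc << 1) ^ POLY64 : crc << 1;
    }
    return ~crc;
}

/* Hypothetical self-check: both versions must agree. */
static void crc64_check(void)
{
    const uint8_t msg[] = "123456789";
    build_table();
    assert(crc64_table(0, msg, 9) == crc64_bitwise(0, msg, 9));
}
```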
The added .h header files don't contain C code, so they are excluded
from clang-format, like the header files defined for aarch64.
Signed-off-by: Daniel Gregory <[email protected]>