Numerous applications require linear algebra
modulo 2, 3, 5 or another very small prime. For example in the study of
graphs, prank of adjacency matrices is often a useful discriminator.
Integer Smith Normal Form computation by modular methods usually has as
its greatest challenge the determination of the tiny prime contribution
to the invariant factors.
Modular residues may be packed into machine words so that several
arithmetic computations are done in one machine instruction. We report
on several packing schemes including packing into doubles and using
floating point arithmetic, packing into ints and using a combination of
int arithmetic and bit operations, finally a scheme called bitslicing.
We report performance measurements of the packing strategies on several
architectures and discuss tuning issues. Finally we discuss the
software design concerning the incorporation of such packing in the
LinBox C++ template library for exact linear algebra. The question is
how to factor the code so that packed and unpacked vectors are equally
usable by algorithms that are generic with respect to ground field
representation, vector details, and matrix representation.
