To introduce myself, I work with Andrew Lumsdaine on the Matrix
Template Library.
Kent Budge writes:
> Certainly not. The problem with C++ number crunching isn't the crunching
> itself, but the way the numbers are gathered up for crunching. In other
> words, it's not an issue of how the FPU is being used, but of how the
> cache and registers are being used.
>
Absolutely.
> If it matters at all. I've had a hard time detecting any performance
> hits due to aliasing on my RISC workstation. Cache management is much
> more important.
I think aliasing is only a problem when it prevents the compiler
from unrolling loops. But you're right: cache management, and above all
register management, matters more.
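To make the aliasing point concrete, here is a minimal sketch of one way aliasing can constrain a loop. The function names are hypothetical and are not taken from MTL. In the first version the scalar arrives by reference, so the compiler generally has to assume a store into y might change it. The second version copies the scalar into a local, which removes that dependence.

```cpp
#include <cassert>
#include <cstddef>

// y[i] += a * x[i], with the scalar passed by reference.
// `a` could refer to an element of y, so the compiler generally has to
// assume each store to y[i] may change it and reload `a` every iteration.
inline void axpy_by_ref(std::size_t n, const double& a,
                        const double* x, double* y) {
    assert(n == 0 || (x != nullptr && y != nullptr));
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

// Same loop, but the scalar is copied into a local first. The local cannot
// alias y, so it can stay in a register for the whole loop.
inline void axpy_by_value(std::size_t n, const double& a,
                          const double* x, double* y) {
    assert(n == 0 || (x != nullptr && y != nullptr));
    const double a_local = a;
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a_local * x[i];
}
```

The two versions give the same result whenever the scalar does not live inside y. If it does, they can differ, which is exactly why the compiler cannot make this transformation on its own. How much this costs in practice depends on the compiler and the machine.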
>
> There *is* one optimization difficulty that is fairly C++-specific. If
> one is using a value class, e.g., complex<double>, to build
> expressions, the compiler usually pushes intermediate results back into
> memory rather than holding them in a register as would be the case for
> built-in types. This can result in a *big* performance hit. KAI has
> solved this problem in their compiler and get FORTRAN-like performance
> for computations on complex numbers that use the complex<double>
> class.
>
This optimization -- lightweight object optimization -- is what
makes high performance really possible in our Matrix Template Library.
We use iterators (very small objects) everywhere. All of our algorithms
are written in terms of iterators; there is no indexing through arrays.
With KAI C++, there is NO performance penalty for doing this.
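As an illustration of what "very small objects" means here, the following is a minimal sketch of an algorithm written in terms of iterators. The names dense_iter and dot are hypothetical and this is not MTL's actual code. The iterator is a value class holding a single pointer, which is the kind of object a lightweight object optimization can keep entirely in a register.

```cpp
#include <cstddef>

// A hypothetical minimal iterator: a value class wrapping one pointer.
class dense_iter {
public:
    explicit dense_iter(const double* p) : p_(p) {}
    double operator*() const { return *p_; }
    dense_iter& operator++() { ++p_; return *this; }
    bool operator!=(const dense_iter& other) const { return p_ != other.p_; }
private:
    const double* p_;
};

// Dot product written purely in terms of iterators, with no array indexing.
template <class IterX, class IterY>
double dot(IterX first_x, IterX last_x, IterY first_y) {
    double sum = 0.0;
    for (; first_x != last_x; ++first_x, ++first_y)
        sum += *first_x * *first_y;
    return sum;
}
```

For two arrays x and y of length 3, the call would be dot(dense_iter(x), dense_iter(x + 3), dense_iter(y)). Since dot is a template, the same code also accepts raw pointers. Whether the wrapper is truly free depends on the compiler.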
As to register management, you would find our paper about
our Basic Linear Algebra Instruction Set (BLAIS) interesting. We've
used template metaprograms to handle the register-level blocking
in algorithms such as matrix-matrix multiply. You can
find the paper in the publications section of our web page:
http://www.lsc.nd.edu/research/mtl/publications.shtml
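For readers unfamiliar with template metaprograms, here is a minimal sketch of the general technique: a fixed-size kernel unrolled at compile time, then used as the inner block of a longer loop. The names unrolled_dot and dot_blocked and the block size of 4 are assumptions made for illustration. This is not the BLAIS interface described in the paper.

```cpp
#include <cstddef>

// Recursive case: the first N-1 products plus the product for element N-1.
// The recursion is resolved at compile time, so no loop remains.
template <std::size_t N>
struct unrolled_dot {
    static double apply(const double* x, const double* y) {
        return unrolled_dot<N - 1>::apply(x, y) + x[N - 1] * y[N - 1];
    }
};

// Base case: ends the compile-time recursion.
template <>
struct unrolled_dot<0> {
    static double apply(const double*, const double*) { return 0.0; }
};

// Uses the fixed-size kernel on blocks of 4 elements, with a plain loop
// for whatever is left over at the end.
inline double dot_blocked(std::size_t n, const double* x, const double* y) {
    const std::size_t block = 4;
    double sum = 0.0;
    std::size_t i = 0;
    for (; i + block <= n; i += block)
        sum += unrolled_dot<4>::apply(x + i, y + i);
    for (; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

The block size is a compile-time constant, so the kernel expands into straight-line code whose operands a compiler can hold in registers. A real register-blocked matrix-matrix multiply would apply the same idea in two dimensions.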
Regards,
Jeremy Siek