Steve Stevenson wrote:
>> The major reason (for numerics) is that the bulk of general-purpose C++
>> compilers do not perform aggressive floating point optimization. There
>> are exceptions (PGI, KAI) but one still needs pragmas to warn off
>> aliasing, etc.
> Unfortunately, "aggressive" floating point optimization is not
> particularly possible. Consider the Dragon book --- which admonishes
> to *leave the floating point stuff alone.* Please read Goldberg's
> "What Every Computer Scientist Should Know About Floating-Point
> Arithmetic". Folks, none of the "rules" of arithmetic you know hold in
> FP except stuff about the units and commutation. You'll mess up some
> programmer's hard derived program for the sake of a wrong
I don't think the first message (posted by Roldan Pozo, I believe?)
was talking about the type of aggressive optimizations that are of
concern here, and I don't think anyone is considering those sorts of
dangerous practices as part of their optimization arsenal.
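To make the concern concrete, here is a small C++ sketch showing that floating point addition is not associative, so a compiler that regroups a sum can change the answer. The function names are made up for this illustration, and it assumes IEEE 754 double arithmetic compiled without fast-math style flags.

```cpp
#include <cassert>

// (a + b) + c, evaluated strictly left to right.
double sum_left(double a, double b, double c) { return (a + b) + c; }

// a + (b + c): same operands, different grouping.
double sum_right(double a, double b, double c) { return a + (b + c); }

// Hypothetical demo; assumes IEEE 754 doubles and no -ffast-math.
void reassociation_demo() {
    const double big = 1e20;
    // big and -big cancel exactly, so the 1.0 survives.
    assert(sum_left(big, -big, 1.0) == 1.0);
    // -big + 1.0 rounds back to -big (1.0 is far below the spacing
    // between adjacent doubles near 1e20), so the 1.0 is lost.
    assert(sum_right(big, -big, 1.0) == 0.0);
}
```

A compiler that silently rewrote one grouping into the other would turn an answer of 1 into an answer of 0, which is the kind of transformation nobody here is proposing.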
In response to the original question -- what is it about C++ that
precludes high performance? -- the answer is not very much, per se.
(The rest of this post is in answer to that original question.)
C++ provides lots of nice abstraction mechanisms, and it is easy for
them to get in the way and obscure performance issues. This is what
people usually bring up when they say C++ is not high performance. On
the other hand, there are ways that one can use C++ abstractions to
enable high performance. In fact, given that C++ is more expressive
than Fortran, it is possible (and Dan Quinlan has reported some
preliminary work in this regard) for C++ to outperform Fortran by
large factors (by a factor of four or eight, say), because the extra
semantic content in C++ expressions can enable more sophisticated
cache-aware loop transformations, for instance.
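As one small illustration of an abstraction that carries extra information to the compiler, consider a dot product whose length is a template parameter. This is only a sketch with hypothetical names (Dot, dot, dot_loop), not code from any particular library: because N is known at compile time, the recursion can be expanded into straight-line code, whereas the run-time loop version leaves that decision to the optimizer.

```cpp
#include <cassert>
#include <cstddef>

// Run-time length: an ordinary loop, shown for comparison.
double dot_loop(std::size_t n, const double* x, const double* y) {
    assert(x != nullptr && y != nullptr);
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += x[i] * y[i];
    return sum;
}

// Compile-time length: Dot<N> adds term N-1 to the result of Dot<N-1>,
// so the terms are accumulated in the same order as in dot_loop.
template <std::size_t N>
struct Dot {
    static double apply(const double* x, const double* y) {
        return Dot<N - 1>::apply(x, y) + x[N - 1] * y[N - 1];
    }
};

// Base case ends the recursion.
template <>
struct Dot<0> {
    static double apply(const double*, const double*) { return 0.0; }
};

// Convenience wrapper: N is deduced from the array type.
template <std::size_t N>
double dot(const double (&x)[N], const double (&y)[N]) {
    return Dot<N>::apply(x, y);
}
```

For example, with `double x[3] = {1, 2, 3}` and `double y[3] = {4, 5, 6}`, both `dot(x, y)` and `dot_loop(3, x, y)` compute 1*4 + 2*5 + 3*6.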
(There is one abstraction in C++ and C that does per se preclude
high performance, namely pointers: the compiler must assume that two
pointers may alias the same memory, which blocks many optimizations.
However, as Roldan pointed out, most compilers provide ways around
this problem through the restrict keyword, noalias pragmas, and the
like.)
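A minimal sketch of the restrict approach follows. Standard C++ has no restrict keyword, so this uses the compiler extensions I am aware of (`__restrict__` in GCC and Clang, `__restrict` in MSVC) behind a macro; the macro name NOALIAS and the function axpy are just names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>

// NOALIAS is a hypothetical macro name for this sketch. It maps to the
// compiler's restrict extension, or to nothing if none is known.
#if defined(_MSC_VER)
#define NOALIAS __restrict
#elif defined(__GNUC__) || defined(__clang__)
#define NOALIAS __restrict__
#else
#define NOALIAS
#endif

// y[i] += a * x[i]. The qualifiers promise the compiler that x and y
// do not overlap, so it need not reload x[i] after each store to y[i].
void axpy(std::size_t n, double a,
          const double* NOALIAS x, double* NOALIAS y) {
    assert(x != nullptr && y != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

The promise is the caller's responsibility: calling axpy with overlapping x and y is undefined behavior under these qualifiers, so this only fits code where the arrays really are distinct.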
The basic route to high performance is to make use of the
architectural mechanisms in your microprocessor that are there for
high performance -- cache and pipelining in particular. The ways to
take advantage of cache and pipelining typically manifest themselves
in code as loop blocking, unrolling, register blocking, etc. But
notice that these issues are all language independent. That is, the
blocking, unrolling, and so forth that one does in Fortran to get high
performance can also be done in C++, with the same effect on
performance. The advantages of doing the code in C++ should be
obvious (at least to this audience) -- one gets all the performance of
Fortran, together with all the software engineering and code reuse
advantages of C++.
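To show what loop blocking looks like when written in C++, here is a rough sketch of a matrix multiply in a plain and a cache-blocked form. It is not taken from MTL; the function names, the row-major std::vector storage, and the default block size of 32 are all assumptions made for this example, and a real block size would be tuned to the cache of the machine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reference version: C += A * B for row-major n x n matrices.
void matmul_naive(std::size_t n, const std::vector<double>& A,
                  const std::vector<double>& B, std::vector<double>& C) {
    assert(A.size() == n * n && B.size() == n * n && C.size() == n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            double sum = C[i * n + j];
            for (std::size_t k = 0; k < n; ++k)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}

// Blocked version: the same arithmetic, done one Block x Block tile at
// a time so the tiles of A, B, and C in use can stay in cache.
template <std::size_t Block = 32>
void matmul_blocked(std::size_t n, const std::vector<double>& A,
                    const std::vector<double>& B, std::vector<double>& C) {
    static_assert(Block > 0, "block size must be positive");
    assert(A.size() == n * n && B.size() == n * n && C.size() == n * n);
    for (std::size_t ii = 0; ii < n; ii += Block) {
        const std::size_t i_end = std::min(ii + Block, n);
        for (std::size_t kk = 0; kk < n; kk += Block) {
            const std::size_t k_end = std::min(kk + Block, n);
            for (std::size_t jj = 0; jj < n; jj += Block) {
                const std::size_t j_end = std::min(jj + Block, n);
                // Multiply one tile of A by one tile of B.
                for (std::size_t i = ii; i < i_end; ++i)
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const double a = A[i * n + k];
                        for (std::size_t j = jj; j < j_end; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
        }
    }
}
```

The loop structure is the same one would write in Fortran; the only C++-specific touch is that the block size is a template parameter, so `matmul_blocked<16>(n, A, B, C)` and `matmul_blocked<64>(n, A, B, C)` are separate instantiations with the tile bounds known at compile time.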
We have implemented a package (MTL) that demonstrates exactly these
points, and it is (finally, almost) ready for release. For an early
look, see
http://www.lsc.nd.edu/research/mtl/
Note that we are just in the process of releasing this package -- it
is in its alpha version right now -- normal caveats apply.
Regards,
Andrew Lumsdaine
------------------------------------------------------------------------
Andrew Lumsdaine
Associate Professor email: lums@lsc.nd.edu
Dept. Comp. Sci. & Engr. phone: (219) 631-8716
353 Fitzpatrick Hall fax: (219) 631-9260
University of Notre Dame www: http://www.cse.nd.edu/~lums/
Notre Dame, IN 46556
------------------------------------------------------------------------