OON: Re: oon-digest V1 #22

From: Andrew Lumsdaine (lums@lsc.nd.edu)
Date: Wed Jul 29 1998 - 16:31:44 EST


Steve Stevenson wrote:

>> The major reason (for numerics) is that the bulk of general-purpose C++
>> compilers do not perform aggressive floating point optimization. There
>> are exceptions (PGI, KAI) but one still needs pragmas to warn off
>> aliasing, etc.

   Unfortunately, "aggressive" floating point optimization is not
   particularly possible. Consider the Dragon book --- which admonishes
   to *leave the floating point stuff alone.* Please read Goldberg's
   "What Every Computer Scientist Should Know About Floating Point
   Arithmetic". Folks, none of the "rules" of arithmetic you know hold in
   FP except stuff about the units and commutation. You'll mess up some
   programmers hard derived program for the sake of a wrong

I don't think the first message (posted by Roldan Pozo, I believe?)
was talking about the type of aggressive optimizations that are of
concern here, and I don't think anyone is considering those sorts of
dangerous practices as part of their optimization arsenal.
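
For anyone who wants to see concretely why reassociating floating
point arithmetic counts as a dangerous practice, here is a minimal C++
sketch. The function names are made up for illustration, and the only
assumption is ordinary IEEE 754 double arithmetic without flags such
as -ffast-math that permit reassociation.

```cpp
#include <cassert>

// Hypothetical helpers: the same three operands, grouped two ways.
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }

// Illustration only: with these operands the two groupings disagree.
void demo_non_associativity() {
    const double big = 1e20;
    // (big + -big) + 1.0 is 0.0 + 1.0, which is exactly 1.0.
    assert(sum_left(big, -big, 1.0) == 1.0);
    // -big + 1.0 rounds back to -big, so the total collapses to 0.0.
    assert(sum_right(big, -big, 1.0) == 0.0);
}
```

A compiler that silently rewrote one grouping into the other would
change the answer from 1 to 0, which is the kind of transformation
the quoted warning is about.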

In response to the original question -- what is it about C++ that
precludes high performance -- the answer is: not very much, per se.
(The rest of this post answers that original question.)

C++ provides lots of nice abstraction mechanisms, and it is easy for
them to get in the way and obscure performance issues. This is what
people usually bring up when they say C++ is not high performance. On
the other hand, there are ways to use C++ abstractions to enable high
performance. In fact, given that C++ is more expressive than Fortran,
it is possible (and Dan Quinlan has reported some preliminary work in
this regard) for C++ to outperform Fortran by large factors (a factor
of four or eight, say), because the extra semantic content in C++
expressions can enable more sophisticated cache-aware loop
transformations, for instance.

(There is one abstraction in C++ and C that does, per se, preclude
high performance, namely pointers and the aliasing they permit.
However, as Roldan pointed out, most compilers provide ways around
this problem through the restrict keyword, or noalias pragmas and the
like.)
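
As a rough sketch of what such a workaround looks like in source
code: the example below uses the __restrict spelling, which is a
compiler extension accepted by GCC, Clang and MSVC rather than
standard C++. The axpy routine and its parameter names are chosen
here for illustration and are not taken from any particular library.

```cpp
#include <cassert>
#include <cstddef>

// y[i] += alpha * x[i] for i in [0, n).
// __restrict (a compiler extension, not standard C++) is a promise from
// the programmer that x and y never overlap. Without that promise the
// compiler must assume a store to y[i] could change a later x[j], which
// limits how freely it can reorder or vectorize the loop.
void axpy(std::size_t n, double alpha,
          const double* __restrict x, double* __restrict y) {
    // Cheap debug check of the simplest violation of the promise.
    assert(n == 0 || x != y);
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += alpha * x[i];
    }
}
```

The qualifier does not change what the loop computes for
non-overlapping arrays. It only removes an assumption the compiler
would otherwise have to make, and calling the routine with
overlapping arrays is undefined behavior.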

The basic route to high performance is to make use of the
architectural mechanisms in your microprocessor that are there for
high performance -- cache and pipelining in particular. The ways to
take advantage of cache and pipelining typically manifest themselves
in code as loop blocking, unrolling, register blocking, etc. But
notice, these issues are all language independent. That is, the
blocking, unrolling, and so forth that one does in Fortran to get high
performance can also be done in C++, and with the same results on
performance. The advantages of writing the code in C++ should be
obvious (at least to this audience) -- one gets all the performance of
Fortran, plus all the software engineering and code reuse advantages
of C++.
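
To make the loop blocking mentioned above concrete, here is a small
sketch of a cache-blocked matrix multiply written as a C++ template.
It is not MTL code. The name matmul_blocked, the row-major layout, and
the default block size of 32 are all assumptions picked for the
example, and a good block size depends on the machine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// C += A * B for square n x n matrices stored row-major in std::vector.
// The three outer loops walk over block x block tiles so that the tiles
// being worked on have a chance of staying in cache; the three inner
// loops are the usual multiply restricted to one tile.
template <typename T>
void matmul_blocked(std::size_t n,
                    const std::vector<T>& a,
                    const std::vector<T>& b,
                    std::vector<T>& c,
                    std::size_t block = 32) {
    assert(block > 0);
    assert(a.size() == n * n && b.size() == n * n && c.size() == n * n);
    for (std::size_t ii = 0; ii < n; ii += block) {
        const std::size_t i_end = std::min(ii + block, n);
        for (std::size_t kk = 0; kk < n; kk += block) {
            const std::size_t k_end = std::min(kk + block, n);
            for (std::size_t jj = 0; jj < n; jj += block) {
                const std::size_t j_end = std::min(jj + block, n);
                for (std::size_t i = ii; i < i_end; ++i) {
                    for (std::size_t k = kk; k < k_end; ++k) {
                        const T aik = a[i * n + k];  // reused across the j loop
                        for (std::size_t j = jj; j < j_end; ++j) {
                            c[i * n + j] += aik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

The same tiling could be written in Fortran. What C++ adds here is
that the element type is a template parameter, so one source serves
float, double, or a user-defined numeric type. Note also that each
element of C still accumulates its products in increasing k order, as
in the plain triple loop, so the blocking does not reassociate the
sums.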

We have implemented a package (MTL) that demonstrates exactly these
points, and it is (finally, almost) ready for release. For an early
look, see

  http://www.lsc.nd.edu/research/mtl/

Note that we are just in the process of releasing this package -- it
is in its alpha version right now -- normal caveats apply.

Regards,
Andrew Lumsdaine

------------------------------------------------------------------------
 Andrew Lumsdaine
 Associate Professor          email: lums@lsc.nd.edu
 Dept. Comp. Sci. & Engr.     phone: (219) 631-8716
 353 Fitzpatrick Hall         fax:   (219) 631-9260
 University of Notre Dame     www:   http://www.cse.nd.edu/~lums/
 Notre Dame, IN 46556
------------------------------------------------------------------------


