Re: OON: Usefulness (or otherwise) of valarray for numerical work in C++

From: Gabriel Dos Reis (dosreis@dptmaths.ens-cachan.fr)
Date: Mon Jun 19 2000 - 15:08:30 EST


"Kent Budge" <kgbudge@valinor.sandia.gov> writes:

[...]

| The usefulness of C++ for numerical work is really a separate issue
| from the usefulness of valarray for numerical work. On this point, at
| least, I think you will find near-universal consensus.

Indeed.

[...]

| exactly for this purpose; to
| > allow aggressive optimisation by compilers in various ways e.g. because
| > it is guaranteed to be alias-free. In theory, this should make it
| > potentially faster than ANSI C until the restrict keyword is
| > implemented, should it not?
|
| Yes, that was the idea. I wanted valarray to provide a mechanism for
| expressing any one-loop operation as a single expression which could be
| highly optimized. I also had a vague notion that nested-loop
| expressions could in turn be expressed as single expressions on nested
| template classes, but the experience just wasn't there to see all the
| implications -- you should know that valarray was originally *not* a
| class template, but a pair of classes based on int and double for which
| there *was* some experience. This is because implementations of
| templates were not widely available at the time valarray was first
| proposed.

Or the valarray implementations were just replicas of std::vector<>.
Even today, one can find such implementations.
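
For illustration, here is a minimal sketch of the "one loop as a single
expression" idea discussed above. The function names axpy_expr and axpy_loop
are hypothetical and not part of any library. Both compute a*x + y
element-wise and assume x and y have the same length. Whether the expression
form is compiled any better than the loop depends entirely on the
implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Hypothetical helper: the whole element-wise computation is written as one
// valarray expression, with no explicit loop in user code.
std::valarray<double> axpy_expr(double a,
                                const std::valarray<double>& x,
                                const std::valarray<double>& y) {
    std::valarray<double> r(a * x + y);
    return r;
}

// Hypothetical helper: the same computation as a hand-written loop.
// Assumes x.size() == y.size().
std::valarray<double> axpy_loop(double a,
                                const std::valarray<double>& x,
                                const std::valarray<double>& y) {
    std::valarray<double> r(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        r[i] = a * x[i] + y[i];
    }
    return r;
}
```

The two functions should produce identical results; the only difference is
who writes the loop, the user or the library.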

[...]

|
| valarray was written at a time when vector supercomputers were still
| the sexy leading edge of computing. Unfortunately, the best
| optimization strategy for a vector supercomputer is almost the opposite
| of the best optimization strategy for modern hierarchical-storage
| machines. On a vector supercomputer, you wanted to run the largest
| possible data set past each instruction, so that the vector pipeline
| remained full. On a hierarchical-memory machine, you want to throw the
| largest possible number of instructions at a particular working set of
| data, so that you keep your data in cache (or paged into memory or on
| processor, depending on which level of the memory hierarchy you are
| concerned with.) valarray might conceivably have been helpful for
| optimization on vector machines, because it assumes operations are best
| treated atomically. It's hopeless on modern machines.

I respectfully disagree with these last sentences.
On modern machines, one tries to take full advantage of pipelining. A
good optimizing C++ implementation can also take advantage of cache
effects. Actually, I think valarray can benefit from those
technologies.
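
To make the two evaluation strategies under discussion concrete, here is a
rough sketch using plain std::vector. The names eval_per_operation and
eval_fused are hypothetical, and neither function is claimed to be how any
particular valarray implementation works. The first treats each operation
atomically, making a full pass over the data and a temporary per operator.
The second makes a single pass, doing all the arithmetic on each element
while it is at hand. An implementation would presumably need to reach
something like the second form to benefit from the cache.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Operation-at-a-time evaluation of a*x + y*z: three passes, two temporaries.
// Assumes x, y and z all have the same length.
std::vector<double> eval_per_operation(double a,
                                       const std::vector<double>& x,
                                       const std::vector<double>& y,
                                       const std::vector<double>& z) {
    const std::size_t n = x.size();
    std::vector<double> t1(n), t2(n), r(n);
    for (std::size_t i = 0; i < n; ++i) t1[i] = a * x[i];      // pass 1
    for (std::size_t i = 0; i < n; ++i) t2[i] = y[i] * z[i];   // pass 2
    for (std::size_t i = 0; i < n; ++i) r[i] = t1[i] + t2[i];  // pass 3
    return r;
}

// Fused evaluation of the same expression: one pass, no temporaries.
// Assumes x, y and z all have the same length.
std::vector<double> eval_fused(double a,
                               const std::vector<double>& x,
                               const std::vector<double>& y,
                               const std::vector<double>& z) {
    const std::size_t n = x.size();
    std::vector<double> r(n);
    for (std::size_t i = 0; i < n; ++i) {
        r[i] = a * x[i] + y[i] * z[i];
    }
    return r;
}
```

Both functions compute the same values; they differ only in how many times
the data travels through the memory hierarchy.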

| In any case, it's not at all clear that valarray is the right philosophy.
| valarray was meant to replace loops with expressions, but STL has shown
| that loops can be beautiful.

Whether loops can be beautiful says nothing about the (supposed or not)
efficiency of valarray.
Fortran (warning: I'm not a big fan of Fortran programming, I'm just
interested in practical decisions) introduced "hardwired" loop
constructs, of which valarray is just the counterpart.

[...]

| > Compiler writers; are you taking full advantage of valarray? Does it
| > offer what it suggests?
|
| Arch Robison can answer this better than I, but the short answer to both
| questions is No.

There is at least one implementation that is taking advantage of the
alias-freeness assumption.

To answer the original question, I would say the numerical computing
community seems to be a very small market and most of the time
implementors try to satisfy the demands of the majority. Thus your
mileage may vary depending on the market targeted by your compiler
vendor.

-- Gaby




This archive was generated by hypermail 2b29 : Wed Feb 20 2002 - 03:20:13 EST