Blitz++ loop kernel benchmarks

These benchmark results are for 21 loop kernels (originally from a suite used by IBM engineers to benchmark the RS/6000). The histograms show the performance of the Blitz++ classes Vector<T> and Array<T,1> relative to the native Fortran compiler. There are two pairs of graphs: the first pair shows peak in-cache performance (when all the data fits in the L1 cache), and the second pair shows out-of-cache performance (when the data must be read from main memory).

IBM RS/6000 43P


Values above 1.0 mean Blitz++ is faster than Fortran; values below 1.0 mean it is slower.

Median performance: 97.3% of Fortran for in-cache, 93.5% for out-of-cache.
Mean performance: 93.2% in-cache, 90.7% out-of-cache.
Blitz++ is slower than Fortran 77 on some loop kernels because Blitz++ doesn't yet do loop jamming (fusing adjacent loops over the same index range into a single loop). It is slightly faster on some kernels because Blitz++ is more aggressive about loop unrolling.
Compilers: XL Fortran 77 at -O3; KAI C++ at +K3 -O3
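
To make the loop jamming remark concrete, here is a minimal hand-written sketch of kernel 12 ($x=$a+$b; $y=$a-$b) in plain C++, first as two separate loops and then as one jammed loop. This is not Blitz++ code or the benchmark source: the function names kernel12_separate and kernel12_jammed and the use of std::vector are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Kernel 12 as two separate loops, one per statement.
// Each loop streams through a and b, so both arrays are read twice.
void kernel12_separate(const std::vector<double>& a,
                       const std::vector<double>& b,
                       std::vector<double>& x,
                       std::vector<double>& y)
{
    assert(a.size() == b.size() && x.size() == a.size() && y.size() == a.size());
    const std::size_t n = a.size();
    for (std::size_t i = 0; i < n; ++i)
        x[i] = a[i] + b[i];
    for (std::size_t i = 0; i < n; ++i)
        y[i] = a[i] - b[i];
}

// The same kernel after loop jamming: both statements share one loop,
// so a[i] and b[i] are loaded once and reused for x[i] and y[i].
void kernel12_jammed(const std::vector<double>& a,
                     const std::vector<double>& b,
                     std::vector<double>& x,
                     std::vector<double>& y)
{
    assert(a.size() == b.size() && x.size() == a.size() && y.size() == a.size());
    const std::size_t n = a.size();
    for (std::size_t i = 0; i < n; ++i) {
        const double ai = a[i];
        const double bi = b[i];
        x[i] = ai + bi;
        y[i] = ai - bi;
    }
}
```

Both functions compute the same results; they differ only in how many passes are made over the input arrays. Whether the jammed form is actually faster depends on the compiler and the machine.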

Cray T3E (single PE)

These results are courtesy of NERSC, and are for the machine mcurie.lbl.gov:


Values above 1.0 mean Blitz++ is faster than Fortran; values below 1.0 mean it is slower.

Median performance: 98.1% of Fortran for in-cache, 95.7% for out-of-cache.
Mean performance: 88.4% in-cache, 86.4% out-of-cache.
The extreme outliers are exp(y) and sqrt(y). For a couple of loops, Blitz++ is faster than Fortran because Blitz++ doesn't do loop jamming (very surprising). However, there are other loops where the lack of loop jamming hurts performance.
f90 -O 3 -O aggress -O unroll2 -O pipeline3
KCC +K3 -O3 -backend -hpipeline3 -backend -hunroll2 -backend 

The loop kernels

A $ sign indicates a vector operand. Operands without $ are scalar.
1. $x = sqrt($y)
2. $x = $y / $u
3. $y = $y + $a * $x
4. $x = $a + $b
5. $x = $a * $b
6. $x = u / $a
7. $x = $x + $a
8. $x = u + $a + $b + $c
9. $x = $a + $b + $c + $d
10. $y = u + $a; $x = $a + $b + $c + $d
11. $x = $a + $b + $c + $d; $y = u + $d
12. $x = $a + $b; $y = $a - $b
13. $x = $c + $a * $b
14. $x = $a + $b + $c; $y = $x + $c + u
15. $x = ($a + $b) * ($c + $d)
16. $x = (u + $a) * (v + $b)
17. $x = u * $a; $y = v * $b
18. $x = $a * $b + $c * $d
19. $x = $x + $a * $b + $c * $d
20. $x = $a * $b + $c * $d; $y = $b + $d
21. $x = $a * $c - $b * $d; $y = $a * $d + $b * $c
22. $x = u * $b; $y = v * $b + w * $a + u * $c
23. $x = exp($e)
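
As a reading aid for the notation, here is a minimal sketch of how kernel 16, which mixes scalar and vector operands, could be spelled out as an explicit element-wise loop in plain C++. The function name kernel16 and the use of std::vector are assumptions for illustration; this is not the benchmark source, and the Blitz++ versions are written as array expressions rather than explicit loops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Kernel 16: $x = (u + $a) * (v + $b)
// u and v are scalars; a, b and the result x are vectors of equal length.
std::vector<double> kernel16(double u, double v,
                             const std::vector<double>& a,
                             const std::vector<double>& b)
{
    assert(a.size() == b.size());
    std::vector<double> x(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        x[i] = (u + a[i]) * (v + b[i]);  // scalars are applied to every element
    return x;
}
```

For example, with u = 1, v = 2, a = {1, 2} and b = {3, 4}, the result is {10, 18}.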