Blitz++ loop kernel benchmarks
These benchmark results are for 21 loop kernels (originally from
a suite used by IBM engineers to benchmark the RS/6000).
The histograms show performance of the Blitz++ classes
Vector<T> and Array<T,1> relative to the
native Fortran compiler. There are two pairs of graphs:
the first pair shows peak in-cache performance (when all
the data fits in the L1 cache), and the second pair shows
out-of-cache performance (when the data must be
read from main memory).
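As a rough way to tell the two regimes apart, the sketch below estimates a kernel's working set and compares it with an L1 cache size. It assumes 8-byte double elements, and the cache size is a parameter the caller supplies; neither figure comes from the benchmark itself, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Bytes touched by one kernel when each of its vector operands holds n elements.
// The element size defaults to sizeof(double), which is an assumption.
std::size_t working_set_bytes(std::size_t num_vectors, std::size_t n,
                              std::size_t bytes_per_element = sizeof(double)) {
    return num_vectors * n * bytes_per_element;
}

// True when the whole working set fits in an L1 data cache of l1_bytes.
// l1_bytes is supplied by the caller; it is not a figure from the benchmark.
bool fits_in_l1(std::size_t num_vectors, std::size_t n, std::size_t l1_bytes) {
    assert(l1_bytes > 0);
    return working_set_bytes(num_vectors, n) <= l1_bytes;
}
```

For example, a kernel with three vector operands of 1000 doubles each touches 24000 bytes, which would fit in a hypothetical 32 KB L1 cache; the same kernel with 100000 elements per vector would not.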
IBM RS/6000 43P
Values > 1.0 mean Blitz++ is faster than Fortran; values < 1.0 mean it is slower.
Median performance: 97.3% of Fortran for in-cache, 93.5% for out-of-cache.
Mean performance: 93.2% in-cache, 90.7% out-of-cache.
Blitz++ is slower than Fortran 77 on some loop kernels because it
doesn't yet do loop jamming. It is slightly faster on some kernels
because Blitz++ is more aggressive about loop unrolling.
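Loop jamming (also called loop fusion) merges adjacent loops over the same index range into one loop. The sketch below shows the idea on a two-statement kernel of the form x = a + b; y = a - b, written with plain std::vector rather than the Blitz++ classes. The function names are hypothetical, and this illustrates the transformation only; it is not the code that was benchmarked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Unjammed: one loop per statement, so a and b are each read twice.
void sum_diff_separate(const Vec& a, const Vec& b, Vec& x, Vec& y) {
    assert(a.size() == b.size() && x.size() == a.size() && y.size() == a.size());
    for (std::size_t i = 0; i < a.size(); ++i) x[i] = a[i] + b[i];
    for (std::size_t i = 0; i < a.size(); ++i) y[i] = a[i] - b[i];
}

// Jammed: both statements share one loop, so a[i] and b[i] are loaded once
// per iteration and the loop overhead is paid once.
void sum_diff_jammed(const Vec& a, const Vec& b, Vec& x, Vec& y) {
    assert(a.size() == b.size() && x.size() == a.size() && y.size() == a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double ai = a[i];
        const double bi = b[i];
        x[i] = ai + bi;
        y[i] = ai - bi;
    }
}
```

Both functions compute the same results; they differ only in how many passes are made over the operands.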
Compilers:
XL Fortran 77 at -O3
KAI C++ at +K3 -O3
Cray T3E (single PE)
These results are courtesy of NERSC,
and are for the machine mcurie.lbl.gov.
Values > 1.0 mean Blitz++ is faster than Fortran; values < 1.0 mean it is slower.
Median performance: 98.1% of Fortran for in-cache, 95.7% for out-of-cache.
Mean performance: 88.4% in-cache, 86.4% out-of-cache.
The extreme outliers are exp(y) and sqrt(y).
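A few slow outliers pull the mean down much more than the median, which would explain why the mean figures above sit well below the medians. The sketch below shows one way such summary statistics could be computed from per-kernel ratios. It assumes an arithmetic mean, which the page does not state, and any input values passed to it are for illustration only, not measured results.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Arithmetic mean of per-kernel performance ratios (Blitz++ relative to Fortran).
double mean(const std::vector<double>& ratios) {
    assert(!ratios.empty());
    return std::accumulate(ratios.begin(), ratios.end(), 0.0) / ratios.size();
}

// Median of the ratios. The vector is taken by value so that sorting
// does not disturb the caller's data.
double median(std::vector<double> ratios) {
    assert(!ratios.empty());
    std::sort(ratios.begin(), ratios.end());
    const std::size_t mid = ratios.size() / 2;
    return ratios.size() % 2 == 1 ? ratios[mid]
                                  : 0.5 * (ratios[mid - 1] + ratios[mid]);
}
```

With made-up ratios of 1.0, 1.0, 1.0, 1.0 and 0.5, the median stays at 1.0 while the single slow kernel drags the mean down to about 0.9.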
For a couple of loops, Blitz++ is faster than Fortran because it
doesn't do loop jamming (very surprising). However, there
are other loops where the lack of loop jamming hurts performance.
Compilers:
f90 -O 3 -O aggress -O unroll2 -O pipeline3
KCC +K3 -O3 -backend -hpipeline3 -backend -hunroll2 -backend
The loop kernels
A $ sign indicates a vector operand. Operands without $ are scalar.
1. $x = sqrt($y)
2. $x = $y / $u
3. $y = $y + $a * $x
4. $x = $a + $b
5. $x = $a * $b
6. $x = u / $a
7. $x = $x + $a
8. $x = u + $a + $b + $c
9. $x = $a + $b + $c + $d
10. $y = u + $a; $x = $a + $b + $c + $d
11. $x = $a + $b + $c + $d; $y = u + $d
12. $x = $a + $b; $y = $a - $b
13. $x = $c + $a * $b
14. $x = $a + $b + $c; $y = $x + $c + u
15. $x = ($a + $b) * ($c + $d)
16. $x = (u + $a) * (v + $b)
17. $x = u * $a; $y = v * $b
18. $x = $a * $b + $c * $d
19. $x = $x + $a * $b + $c * $d
20. $x = $a * $b + $c * $d; $y = $b + $d
21. $x = $a * $c - $b * $d; $y = $a * $d + $b * $c
22. $x = u * $b; $y = v * $b + w * $a + u * $c
23. $x = exp($e)
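To make the notation concrete, here is a minimal sketch of kernel 16 as an explicit element-wise loop, using std::vector<double> in place of the Blitz++ classes. The function name and the choice of double as the element type are assumptions made for illustration. With Blitz++ arrays the same kernel would presumably be written as a single whole-array expression along the lines of x = (u + a) * (v + b).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Kernel 16: $x = (u + $a) * (v + $b)
// x, a and b are vector operands (marked with $); u and v are scalars.
void kernel16(std::vector<double>& x, double u, const std::vector<double>& a,
              double v, const std::vector<double>& b) {
    assert(a.size() == b.size() && x.size() == a.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        x[i] = (u + a[i]) * (v + b[i]);
    }
}
```

Each element of the result depends only on the elements at the same index of the inputs, which is what makes these kernels candidates for unrolling and jamming.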