Re: BZDEV: New snapshot, looptest.cpp (Cray T3E,Cray C++)

From: Todd Veldhuizen (tveldhui@oonumerics.org)
Date: Sun Apr 12 1998 - 12:57:54 EST


Looptest output for Cray T3E, but this time using the native compiler:

Hardware: Cray T3E (mcurie.nersc.gov), single PE, 450 MHz Alpha
Theoretical peak: 900 Mflops/PE
OS: Unicos 2.0.2.24
C++ compiler: Cray C++ 3.0.2.0
Flags: -O3 -hpipeline3 -hunroll -haggress -hscalar2

In-cache:
Mflops/s Description
 330.32 for, indirection, unit stride
 332.27 for, indirection, unit stride, no +=
 305.89 for, indirection, unit stride, backwards loops
 333.71 for, unroll=4, unit stride, constants loaded into temps
 331.99 for, unroll=4, unit stride, constants loaded into temps, no +=
  289.3 for, unroll=4, unit stride, constants loaded into temps,
        CSE for index offsets
 316.59 for, unroll=4, unit stride, constants loaded into temps, backwards
 335.05 for, unroll=8, unit stride, constants loaded into temps
 331.92 for, indirection, unit stride, constants into temps
 332.31 for, indirection, non-unit stride
 332.03 for, indirection, non-unit stride, constants loaded into temps
 50.168 while, pointer increment, unit stride
 50.291 while, pointer increment, unit stride, constants loaded into temps
 50.176 while, pointer increment, non-unit stride
  64.38 while, pointer increment, unroll=4, non-unit stride,
        constants loaded into temps
  250.6 for, unroll=4, unit stride, constants loaded into temps, prefetching
  332.4 interlaced, for, indirection, unit stride

Out of cache:
Mflops/s Description
 55.659 for, indirection, unit stride
 55.899 for, indirection, unit stride, no +=
 22.108 for, indirection, unit stride, backwards loops
 55.789 for, unroll=4, unit stride, constants loaded into temps
 55.753 for, unroll=4, unit stride, constants loaded into temps, no +=
 55.707 for, unroll=4, unit stride, constants loaded into temps,
        CSE for index offsets
 21.704 for, unroll=4, unit stride, constants loaded into temps, backwards
 55.801 for, unroll=8, unit stride, constants loaded into temps
 55.937 for, indirection, unit stride, constants into temps
 55.904 for, indirection, non-unit stride
 55.882 for, indirection, non-unit stride, constants loaded into temps
 32.311 while, pointer increment, unit stride
 32.532 while, pointer increment, unit stride, constants loaded into temps
  32.29 while, pointer increment, non-unit stride
 38.709 while, pointer increment, unroll=4, non-unit stride,
        constants loaded into temps
 54.811 for, unroll=4, unit stride, constants loaded into temps, prefetching
 34.219 interlaced, for, indirection, unit stride