Looptest output for Cray T3E, but this time using the native compiler:
Hardware: Cray T3E (mcurie.nersc.gov), single PE, 450 MHz Alpha
Theoretical peak: 900 Mflops/PE
OS: Unicos 2.0.2.24
C++ compiler: Cray C++ 3.0.2.0
Flags: -O3 -hpipeline3 -hunroll -haggress -hscalar2
In-cache:
Mflops/s Description
330.32 for, indirection, unit stride
332.27 for, indirection, unit stride, no +=
305.89 for, indirection, unit stride, backwards loops
333.71 for, unroll=4, unit stride, constants loaded into temps
331.99 for, unroll=4, unit stride, constants loaded into temps, no +=
289.3 for, unroll=4, unit stride, constants loaded into temps, CSE for index offsets
316.59 for, unroll=4, unit stride, constants loaded into temps, backwards
335.05 for, unroll=8, unit stride, constants loaded into temps
331.92 for, indirection, unit stride, constants into temps
332.31 for, indirection, non-unit stride
332.03 for, indirection, non-unit stride, constants loaded into temps
50.168 while, pointer increment, unit stride
50.291 while, pointer increment, unit stride, constants loaded into temps
50.176 while, pointer increment, non-unit stride
64.38 while, pointer increment, unroll=4, non-unit stride, constants loaded into temps
250.6 for, unroll=4, unit stride, constants loaded into temps, prefetching
332.4 interlaced, for, indirection, unit stride
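
For readers unfamiliar with the row labels, here is a rough sketch of what three of the loop styles might look like. It assumes the kernel is a DAXPY-style update (y = y + a*x), which the descriptions suggest but this post does not state. The function names are hypothetical and are not taken from the looptest source.

```cpp
#include <cstddef>
#include <vector>

// ASSUMPTION: the benchmarked kernel is y[i] = y[i] + a * x[i].
// All names below are illustrative only.

// "for, indirection, unit stride": plain indexed for loop.
void axpy_for_index(double* y, const double* x, double a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

// "while, pointer increment, unit stride": walk both arrays with pointers.
void axpy_while_pointer(double* y, const double* x, double a, std::size_t n) {
    const double* end = x + n;
    while (x != end) {
        *y += a * (*x);
        ++x;
        ++y;
    }
}

// "for, unroll=4, unit stride, constants loaded into temps":
// the scalar is copied into a local and the body is unrolled by hand.
void axpy_for_unroll4(double* y, const double* x, double a, std::size_t n) {
    const double t = a;  // constant loaded into a temporary
    std::size_t i = 0;
    for (; i + 3 < n; i += 4) {
        y[i]     += t * x[i];
        y[i + 1] += t * x[i + 1];
        y[i + 2] += t * x[i + 2];
        y[i + 3] += t * x[i + 3];
    }
    for (; i < n; ++i)  // remainder when n is not a multiple of 4
        y[i] += t * x[i];
}

// Sanity check: all three variants should produce identical results.
bool variants_agree(std::size_t n) {
    std::vector<double> x(n), y1(n, 1.0), y2(n, 1.0), y3(n, 1.0);
    for (std::size_t i = 0; i < n; ++i)
        x[i] = 0.5 * static_cast<double>(i);
    axpy_for_index(y1.data(), x.data(), 2.0, n);
    axpy_while_pointer(y2.data(), x.data(), 2.0, n);
    axpy_for_unroll4(y3.data(), x.data(), 2.0, n);
    return y1 == y2 && y2 == y3;
}
```

The variants are arithmetically identical, so the spread in the table (roughly 330 Mflops/s for the indexed loops against roughly 50 for the pointer-increment loops) reflects how well the compiler schedules each form rather than any difference in work done.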
Out of cache:
Mflops/s Description
55.659 for, indirection, unit stride
55.899 for, indirection, unit stride, no +=
22.108 for, indirection, unit stride, backwards loops
55.789 for, unroll=4, unit stride, constants loaded into temps
55.753 for, unroll=4, unit stride, constants loaded into temps, no +=
55.707 for, unroll=4, unit stride, constants loaded into temps, CSE for index offsets
21.704 for, unroll=4, unit stride, constants loaded into temps, backwards
55.801 for, unroll=8, unit stride, constants loaded into temps
55.937 for, indirection, unit stride, constants into temps
55.904 for, indirection, non-unit stride
55.882 for, indirection, non-unit stride, constants loaded into temps
32.311 while, pointer increment, unit stride
32.532 while, pointer increment, unit stride, constants loaded into temps
32.29 while, pointer increment, non-unit stride
38.709 while, pointer increment, unroll=4, non-unit stride, constants loaded into temps
54.811 for, unroll=4, unit stride, constants loaded into temps, prefetching
34.219 interlaced, for, indirection, unit stride
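
To put these numbers against the 900 Mflops/PE theoretical peak quoted above, a small helper like the following can be used. It is only a sketch: the function names are made up for illustration, and the only inputs are figures already listed in this post.

```cpp
#include <cstdio>

// Theoretical peak per PE, as quoted in the header of this post.
constexpr double kPeakMflops = 900.0;

// Percentage of theoretical peak reached by a measured rate.
double percent_of_peak(double mflops) {
    return 100.0 * mflops / kPeakMflops;
}

// Slowdown factor going from in-cache to out-of-cache for the same loop.
double cache_slowdown(double in_cache_mflops, double out_of_cache_mflops) {
    return in_cache_mflops / out_of_cache_mflops;
}

// Print a short summary for the best "for" loop and the plain "while" loop.
void print_summary() {
    std::printf("best in-cache:     %.1f%% of peak\n", percent_of_peak(335.05));
    std::printf("best out-of-cache: %.1f%% of peak\n", percent_of_peak(55.937));
    std::printf("for/indirection slowdown: %.2fx\n", cache_slowdown(330.32, 55.659));
    std::printf("while/pointer slowdown:   %.2fx\n", cache_slowdown(50.168, 32.311));
}
```

The best in-cache result (335.05) works out to about 37% of peak, and the best out-of-cache result (55.937) to about 6%.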