Looptest output for RS/6000 (PowerPC), using KCC:
Hardware: RS/6000 PowerPC Model 43P, 100 MHz (oonumerics.org)
Theoretical peak: 200 Mflops
OS: AIX 4.2
C++ compiler: KCC 3.2e
Backend: XL C++ 3.1.4.4
Flags: -O3
In-cache:
Mflops   Description
42.622   for, indirection, unit stride
44.101   for, indirection, unit stride, no +=
44.357   for, indirection, unit stride, backwards loops
56.936   for, unroll=4, unit stride, constants loaded into temps
55.689   for, unroll=4, unit stride, constants loaded into temps, 4 read then 4 write
57.798   for, unroll=4, unit stride, constants loaded into temps, no +=
59.143   for, unroll=4, unit stride, constants loaded into temps, CSE for index offsets
57.364   for, unroll=4, unit stride, constants loaded into temps, backwards
58.688   for, unroll=8, unit stride, constants loaded into temps
57.364   for, indirection, unit stride, constants into temps
44.616   for, indirection, non-unit stride
51.901   for, indirection, non-unit stride, constants loaded into temps
40.799   while, pointer increment, unit stride
46.806   while, pointer increment, unit stride, constants loaded into temps
38.925   while, pointer increment, non-unit stride
51.55    while, pointer increment, unroll=4, non-unit stride, constants loaded into temps
43.349   for, unroll=4, unit stride, constants loaded into temps, prefetching
43.847   interlaced, for, indirection, unit stride
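
For readers unfamiliar with the variant names in the table, here is a rough sketch of what a few of them could look like in C++. The looptest source is not part of this post, so the kernel (a DAXPY-style update y[i] += a*x[i]), the function names, and the reading of "indirection" as plain array indexing are all assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernel: y[i] += a * x[i]. The real looptest kernel is not
// shown in the post; this DAXPY-style update is an assumption.

// "for, indirection, unit stride": plain indexed loop.
void axpy_indexed(std::size_t n, double a, const double* x, double* y) {
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

// "for, unroll=4, unit stride, constants loaded into temps":
// the scalar is copied into a local before the loop, and the body
// handles four elements per iteration.
void axpy_unroll4(std::size_t n, const double& a_ref,
                  const double* x, double* y) {
    const double a = a_ref;  // constant loaded into a temporary
    std::size_t i = 0;
    for (; i + 3 < n; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
    for (; i < n; ++i)       // leftover elements when n is not a multiple of 4
        y[i] += a * x[i];
}

// "while, pointer increment, unit stride": walk both arrays with pointers.
void axpy_pointer(std::size_t n, double a, const double* x, double* y) {
    const double* end = x + n;
    while (x != end) {
        *y += a * (*x);
        ++x;
        ++y;
    }
}

// "for, indirection, non-unit stride": same update, but every
// stride-th element of each array is touched.
void axpy_strided(std::size_t n, double a, const double* x, double* y,
                  std::size_t stride) {
    assert(stride > 0);
    for (std::size_t i = 0; i < n; ++i)
        y[i * stride] += a * x[i * stride];
}
```

All four functions compute the same update; they differ only in how the loop is written, which is the kind of difference the table above is measuring.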
Out of cache:
Mflops   Description
7.0226   for, indirection, unit stride
7.0278   for, indirection, unit stride, no +=
7.0643   for, indirection, unit stride, backwards loops
6.2475   for, unroll=4, unit stride, constants loaded into temps
7.2248   for, unroll=4, unit stride, constants loaded into temps, 4 read then 4 write
6.2068   for, unroll=4, unit stride, constants loaded into temps, no +=
6.2332   for, unroll=4, unit stride, constants loaded into temps, CSE for index offsets
7.2111   for, unroll=4, unit stride, constants loaded into temps, backwards
6.1927   for, unroll=8, unit stride, constants loaded into temps
7.0252   for, indirection, unit stride, constants into temps
6.6878   for, indirection, non-unit stride
6.7042   for, indirection, non-unit stride, constants loaded into temps
6.6667   while, pointer increment, unit stride
6.6621   while, pointer increment, unit stride, constants loaded into temps
6.532    while, pointer increment, non-unit stride
6.9434   while, pointer increment, unroll=4, non-unit stride, constants loaded into temps
7.1975   for, unroll=4, unit stride, constants loaded into temps, prefetching
5.4402   interlaced, for, indirection, unit stride
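
To put the numbers in context against the 200 Mflops theoretical peak quoted at the top, the two small helpers below compute the fraction of peak and the in-cache to out-of-cache ratio. The function names are made up for this note and are not part of looptest.

```cpp
#include <cassert>

// Fraction of the theoretical peak reached by a measured rate.
// Both arguments are in Mflops.
double fraction_of_peak(double mflops, double peak_mflops) {
    assert(peak_mflops > 0.0);
    return mflops / peak_mflops;
}

// How many times faster the in-cache run is than the out-of-cache run
// of the same loop variant.
double cache_speedup(double in_cache_mflops, double out_of_cache_mflops) {
    assert(out_of_cache_mflops > 0.0);
    return in_cache_mflops / out_of_cache_mflops;
}
```

For example, fraction_of_peak(59.143, 200.0) for the fastest in-cache variant comes to a little under 30% of peak, and cache_speedup(42.622, 7.0226) for the plain indexed loop is roughly a factor of 6.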
This archive was generated by hypermail 2b29 : Wed Feb 20 2002 - 04:30:04 EST