Blitz logo

Blitz Support :

From: Julian Cummings (cummings_at_[hidden])
Date: 2004-06-25 20:41:39


Hello Andreas,

I have been looking into the performance issues with these example
programs that you sent.
I see performance numbers similar to yours on my P4 box. The C code
myprogram1 completes in 3.9 sec., whereas the blitz version using
stencil declarations in myprogram3 takes about 5.9 sec. to run. You can
improve somewhat the performance of myprogram3 by replacing the three
constant-value matrices being used in the stencil with actual scalars.
 It is hard to do this using the blitz macros for declaring stencil
objects, but can be done if you write out fully the definition of your
stencil object and modify it to store scalar parameters.

struct spots2D_stencil {
    spots2D_stencil(double sin, double d_ain, double d_bin)
        : s(sin), d_a(d_ain), d_b(d_bin) {}
    template <typename T1, typename T2, typename T3, typename T4,
typename T5,
              typename T6, typename T7, typename T8, typename T9,
typename T10,
              typename T11>
    inline void apply(T1& beta, T2& a, T3& b, T4& a_out, T5& b_out, T6,
        T7, T8, T9, T10, T11) const
    {
        a_out = a + s * (16 - a*b) + d_a * Laplacian2D(a);
        b_out = b + s * (a*b - b - beta) + d_b * Laplacian2D(b);
        if (a_out < 0.0) a_out = 0.0;
        if (b_out < 0.0) b_out = 0.0;
    }
    template <int N>
    inline void getExtent(TinyVector<int,N>& minb,
                          TinyVector<int,N>& maxb) const
    {
        minb = shape(-1,-1);
        maxb = shape(+1,+1);
    }
    enum { hasExtent = 1 };
    const double& s, d_a, d_b;
};

Notice that I have added three members that are const double references
to store the three constant scalar parameters, and a constructor that
initializes these references. With this definition of spots2D_stencil,
you must pass the scalar constants when you construct the stencil object:

    applyStencil(spots2D_stencil(s,d_a,d_b),matrix_beta,
             matrix_a,matrix_b,matrix_a_out,matrix_b_out);

But now we only have five arrays in our stencil, so it is a bit more
efficient. My tests showed a run time of 4.9 sec. with this improvement.

A further improvement would be to add the unit stride optimization to
the implementations of applyStencil(). If the innermost loop has a
stride of 1 for all the Array operands in the stencil, then we can use
advanceUnitStride() to increment all the Array iterators instead of
advance(). Some quick tests that I did indicate that this will reduce
the run time of myprogram3 to about 4.3 sec. Still not quite as fast as
the C code, but fairly close. There will always be some amount of
overhead associated with determining the bounds of the stencil loops in
each dimension and the ordering of the loops, among other things. But I
will try to put in the unit stride optimization for the applyStencil()
function, since it is clearly quite beneficial.

Regards, Julian C.

Andreas R. wrote:

>Hi.
>
>Some time ago I wrote a C program to generate reaction-diffusion textures.
>Now I try to port this to blitz, since blitz's stencil notation is really
>wonderful.
>But I cannot manage to achieve comparable performance (in best case I
>get about 115% overhead of the blitz code compared to C code).
>So I extracted the most important code (omitting dummy values production
>at the boundaries, error handling code, value rescaling, etc.) into
>myprogram1.c and wrote according blitz stencil code into myprogram3.cpp. I use
>a few hacks to keep the code similar and short.
>I use 2D matrices of size 68x68 and perform 30000 iterations (to be sure it is
>not too much affected by the initialization):
>time ./myprogram3 > test3.pgm
>
>real 0m5.406s
>user 0m5.404s
>sys 0m0.002s
>
>compared to C program result:
>time ./myprogram1 > test1.pgm
>
>real 0m2.506s
>user 0m2.505s
>sys 0m0.002s
>
>I then tried to find out the reasons for that. So I took the C's
>implementation and used Array<float, 2> instead of float**
>(myprogram2.cpp), but that made it even worse than stencil notation!
>By the way I didn't find a swap method in Array so I hope:
>matrix_a.swap(matrix_b) for std::vector is equivalent to
>matrix_swap_tmp.reference(matrix_a); matrix_a.reference(matrix_a_out);
>matrix_a_out.reference(matrix_swap_tmp); for blitz::Array.
>time ./myprogram2 > test2.pgm
>
>real 0m6.110s
>user 0m6.108s
>sys 0m0.002s
>
>I took a look at Array's implemetation. It uses a single array, so I
>modified myprogram1.c to do the same, but it didn't affect the execution
>time that much:
>time ./myprogram1b > test1b.pgm
>
>real 0m2.636s
>user 0m2.634s
>sys 0m0.002s
>
>I asked myself if Array is perhaps having too much overhead, so I rewrote
>myprogram2.cpp to use a low overhead version (without reference
>counting, even omitting the freeing of space), but that didn't help much
>either:
>time ./myprogram2b > test2b.pgm
>
>real 0m4.903s
>user 0m4.902s
>sys 0m0.000s
>
>Using std::vector< std::vector<float> > is worse than float** , but better
>than Array:
>time ./myprogram1c > test1c.pgm
>
>real 0m3.779s
>user 0m3.763s
>sys 0m0.002s
>
>I didn't know exactly whether this had something to do with memory
>alignment, so I just adjusted myprogram1b to use one std::vector instead of
>pointer. But this caused still much overhead:
>time ./myprogram1bb > test1bb.pgm
>
>real 0m4.407s
>user 0m4.348s
>sys 0m0.001s
>
>Low-overhead version of std::vector:
>bash-2.05b$ time ./myprogram1bbb > test1bbb.pgm
>
>real 0m4.144s
>user 0m4.128s
>sys 0m0.001s
>
>My last thought was: how would the execution time of the stencil change if I
>use multiple stencils instead of 1 (I would have to use it later, because
>this simple stencil already uses 8 arrays and there are only 11)? So I split
>the stencil and was gladly surprised that the running time decreased and now
>has only 70% overhead over fastest C implementation.
>time ./myprogram3b > test3b.pgm
>
>real 0m4.388s
>user 0m4.268s
>sys 0m0.002s
>
>
>The fastest C++ code I get with std::vector< std::vector<float> > still has
>50% overhead (I compare user times).
>It seems that the simple use of C++ operators causes the code to get
>that slow. Is this really the case?
>
>Previously I actually thought that using std::vector would be just as fast as
>accesses to float*, since (at least when using -O3 or -finline-functions) the
>code should be inlined. Or is there something else? Can somebody verify my
>experience?
>
>I read in a forum
> http://lists.trolltech.com/qt-interest/2004-04/thread00525-0.html
>that using double instead of float saves some time (this was the case on my
>machine), I also tried very many optimization options, so the times above are
>actually the times of my latest code (with the doubles instead of
>floats) and optimization options I found out to work best:
>-O3 -march=pentium4 -mcpu=pentium4 -msse -msse2 -ffast-math
>-fomit-frame-pointer -mfpmath=sse -malign-double
>Or are there perhaps some other options I could enable to speed up the
>execution on g++? Should I try Intel's compiler?
>
>I use
>gcc/g++ (GCC) 3.3.3 20040412 (Gentoo Linux 3.3.3-r6, ssp-3.3.2-2, pie-8.7.6)
>on Linux 2.6.5-gentoo-r1 i686 Intel(R) Pentium(R) 4 CPU 2.60GHz GenuineIntel
>cache size 512 KB and 1GB RAM.
>
>I attach the programs, so you can verify that on your machines if you
>want (you will probably need to modify the OPTIMIZ_FLAGS and substitute
>-march=pentium4 -mcpu=pentium4 with your architecture, then just type
> "make tests"). If you want to see the produced picture, open it with xv,
>right click on it, select Windows->Color Editor, click on HistEq (almost
>bottom, left).
>
>Any advice would be appreciated.
>
>Regards, Andreas R.
>
>
>------------------------------------------------------------------------
>
>CXX = g++
>CC = gcc
>
>BZDIR = /soft/common/software
>BZINC = -I$(BZDIR)/include
>
>#OPTIMIZ_FLAGS = -O3 -march=pentium4 -mfpmath=sse -funroll-loops
>#OPTIMIZ_FLAGS = -O3 -march=pentium4 -mfpmath=sse -funroll-loops -mmmx -msse -msse2 # -malign-double
>OPTIMIZ_FLAGS = -O3 -march=pentium4 -mcpu=pentium4 -msse -msse2 -ffast-math -mfpmath=sse -malign-double -fomit-frame-pointer -DNDEBUG -DNO_DEBUG -DNODEBUG # -mno-ieee-fp -fforce-addr -fforce-mem -maccumulate-outgoing-args -finline-limit=2048
># -no-branch-probabilities -finline-functions
>CFLAGS = -Wall -g $(OPTIMIZ_FLAGS)
>CXXFLAGS = -ftemplate-depth-30 $(CFLAGS) $(BZINC)
>
>LIBS = -L$(BZDIR)/lib -lm -lblitz
>#LIBS = -lm -lblitz
>LDFLAGS =
>
>CXXTARGETS = myprogram1c myprogram1bb myprogram1bbb myprogram2 myprogram2b myprogram3 myprogram3b
>TARGETS = myprogram1 myprogram1b $(CXXTARGETS)
>
>.SUFFIXES: .o .cpp .c
>
>all: $(TARGETS)
># $(TARGETS)
>
># $(LIBS) must be at the end, otherwise:
># undefined reference to `blitz::_dummyArray'
>$(CXXTARGETS): Makefile
> $(CXX) $(LDFLAGS) $(CXXFLAGS) $@.cpp -o $@ $(LIBS)
>
>myprogram1.o myprogram1b.o: Makefile
>
>myprogram1c: myprogram1c.cpp
>myprogram1bb: myprogram1bb.cpp
>myprogram1bbb: myprogram1bbb.cpp
>myprogram2: myprogram2.cpp
>myprogram2b: myprogram2b.cpp
>myprogram3: myprogram3.cpp
>myprogram3b: myprogram3b.cpp
>
>clean:
> rm -f *.o $(TARGETS)
>distclean: clean
> rm -f *~
>
>tests: all timings verify
>timings:
> time ./myprogram1 > test1.pgm
> time ./myprogram1b > test1b.pgm
> time ./myprogram1bb > test1bb.pgm
> time ./myprogram1bbb > test1bbb.pgm
> time ./myprogram1c > test1c.pgm
> time ./myprogram2 > test2.pgm
> time ./myprogram2b > test2b.pgm
> time ./myprogram3 > test3.pgm
> time ./myprogram3b > test3b.pgm
>verify:
> diff test1.pgm test1b.pgm \
> && diff test1.pgm test1bb.pgm \
> && diff test1.pgm test1bbb.pgm \
> && diff test1.pgm test2.pgm \
> && diff test1.pgm test2b.pgm \
> && diff test1.pgm test3.pgm \
> && diff test1.pgm test3b.pgm \
> && echo Test images are OK.
>
>
>------------------------------------------------------------------------
>
>#include <stdlib.h>
>#include <stdio.h>
>
>#include <stdlib.h>
>
>// using "flat" matrices
>typedef struct {
> int rows;
> int cols;
>} MatrixSize;
>
>// low overhead version of std::vector
>// default assigment operator and copy constructor suffice in this example
>// since we do not need deep copy here and the destructor does nothing
>template <typename TYPE>
>class TVector {
> int size; // without size it is slower
> TYPE* values;
>public:
> TVector() : size(0), values(0) {}
> TVector(int size) : size(size)
> {
> // doesn't matter:
> values = new TYPE[size];
>// values = static_cast<double*> (malloc(sizeof(TYPE) * size));
> }
> ~TVector() { /*delete[] values;*/ }
> // no difference to const TYPE&
> TYPE operator[](int i) const {
> // doesn't matter:
>// return *(values + i);
> return values[i];
> }
> TYPE& operator[](int i) {
>// return *(values + i);
> return values[i];
> }
> void resize(int size) {
> this->size = size;
> if (values != 0) delete values;
> values = new TYPE[size];
> }
> void reference(TVector& other) {
> *this = other;
> }
>};
>typedef TVector<double> Matrix;
>
>Matrix newMatrix(MatrixSize size) {
> return Matrix(size.rows * size.cols);
>}
>
>#define POS(i,j) ((i) * matrix_size.cols + (j))
>
>void reaction_diffusion_texture(MatrixSize matrix_size,
> double d_a, double d_b, double s,
> Matrix& a, Matrix& b, Matrix& a_out, Matrix& b_out,
> Matrix& beta)
>{
> int i, j;
> for (i = 1; i < matrix_size.rows - 1; ++i) {
> for (j = 1; j < matrix_size.cols - 1; ++j) {
> a_out[POS(i,j)] = a[POS(i,j)]
> + s * (16 - a[POS(i,j)]*b[POS(i,j)])
> + d_a * ( a[POS(i+1,j)] + a[POS(i-1,j)] + a[POS(i,j)+1] + a[POS(i,j)-1]
> - 4 * a[POS(i,j)] );
> b_out[POS(i,j)] = b[POS(i,j)]
> + s * (a[POS(i,j)] * b[POS(i,j)] - b[POS(i,j)] - beta[POS(i,j)])
> + d_b * ( b[POS(i+1,j)] + b[POS(i-1,j)] + b[POS(i,j)+1] + b[POS(i,j)-1]
> - 4 * b[POS(i,j)] );
> if (a_out[POS(i,j)] < 0.0) a_out[POS(i,j)] = 0.0;
> if (b_out[POS(i,j)] < 0.0) b_out[POS(i,j)] = 0.0;
> }
> }
>}
>
>int main(int argc, char* argv[]) {
> double init_a = 4;
> double init_b = 4;
> double d_a = .125;
> double d_b = .03125;
> double s = .005;
> double mean_beta = 12;
> double beta_variation = .05;
> int beta_random_seed = 1;
> int i, j, k;
> int num_iterations = 30000;
>
> Matrix matrix_a;
> Matrix matrix_b;
> Matrix matrix_a_out;
> Matrix matrix_b_out;
> Matrix matrix_swap_tmp;
> Matrix matrix_beta;
>
> MatrixSize matrix_size = { 64+2, 64+2 };
>
> matrix_a = newMatrix(matrix_size);
> matrix_b = newMatrix(matrix_size);
> matrix_a_out = newMatrix(matrix_size);
> matrix_b_out = newMatrix(matrix_size);
> matrix_swap_tmp = newMatrix(matrix_size);
> matrix_beta = newMatrix(matrix_size);
>
> srand(beta_random_seed);
> for (i = 0; i < matrix_size.rows; ++i) {
> for (j = 0; j < matrix_size.cols; ++j) {
> matrix_a[POS(i,j)] = init_a;
> matrix_b[POS(i,j)] = init_b;
> matrix_beta[POS(i,j)] = mean_beta
> +2.0 * (((double)rand() / RAND_MAX) - 0.5) * beta_variation;
> }
> }
>
> for (k = 0; k < num_iterations; ++k) {
> reaction_diffusion_texture(matrix_size, d_a, d_b, s,
> matrix_a, matrix_b,
> matrix_a_out, matrix_b_out, matrix_beta);
> matrix_swap_tmp = matrix_a; matrix_a = matrix_a_out; matrix_a_out = matrix_swap_tmp;
> matrix_swap_tmp = matrix_b; matrix_b = matrix_b_out; matrix_b_out = matrix_swap_tmp;
> }
>
> printf("P2\n%d %d\n255\n", matrix_size.cols - 2, matrix_size.rows - 2);
> for (i = 1; i < matrix_size.rows - 1; ++i) {
> for (j = 1; j < matrix_size.cols - 1; ++j) {
> printf("%d ", (int) matrix_a[POS(i,j)]);
> }
> printf("\n");
> }
> return 0;
>}
>
>
>------------------------------------------------------------------------
>
>#include <stdlib.h>
>#include <stdio.h>
>
>#include <vector>
>
>typedef struct {
> int rows;
> int cols;
>} MatrixSize;
>
>typedef std::vector< std::vector<double> > Matrix;
>
>Matrix newMatrix(MatrixSize size) {
>// int i;
>// Matrix result(size.rows);
>// for (i = 0; i < size.rows; ++i) {
>// result[i].resize(size.cols);
>// }
>// return result;
> return Matrix(size.rows, std::vector<double>(size.cols));
>}
>
>void reaction_diffusion_texture(MatrixSize matrix_size,
> double d_a, double d_b, double s,
> Matrix& a, Matrix& b, Matrix& a_out, Matrix& b_out,
> Matrix& beta)
>{
> int i, j;
> for (i = 1; i < matrix_size.rows - 1; ++i) {
> for (j = 1; j < matrix_size.cols - 1; ++j) {
> a_out[i][j] = a[i][j]
> + s * (16 - a[i][j]*b[i][j])
> + d_a * ( a[i+1][j] + a[i-1][j] + a[i][j+1] + a[i][j-1]
> - 4 * a[i][j] );
> b_out[i][j] = b[i][j]
> + s * (a[i][j] * b[i][j] - b[i][j] - beta[i][j])
> + d_b * ( b[i+1][j] + b[i-1][j] + b[i][j+1] + b[i][j-1]
> - 4 * b[i][j] );
> if (a_out[i][j] < 0.0) a_out[i][j] = 0.0;
> if (b_out[i][j] < 0.0) b_out[i][j] = 0.0;
> }
> }
>}
>
>int main(int argc, char* argv[]) {
> double init_a = 4;
> double init_b = 4;
> double d_a = .125;
> double d_b = .03125;
> double s = .005;
> double mean_beta = 12;
> double beta_variation = .05;
> int beta_random_seed = 1;
> int i, j, k;
> int num_iterations = 30000;
>
> Matrix matrix_a;
> Matrix matrix_b;
> Matrix matrix_a_out;
> Matrix matrix_b_out;
> Matrix matrix_swap_tmp;
> Matrix matrix_beta;
>
> MatrixSize matrix_size = { 64+2, 64+2 };
>
> matrix_a = newMatrix(matrix_size);
> matrix_b = newMatrix(matrix_size);
> matrix_a_out = newMatrix(matrix_size);
> matrix_b_out = newMatrix(matrix_size);
> matrix_swap_tmp = newMatrix(matrix_size);
> matrix_beta = newMatrix(matrix_size);
>
> srand(beta_random_seed);
> for (i = 0; i < matrix_size.rows; ++i) {
> for (j = 0; j < matrix_size.cols; ++j) {
> matrix_a[i][j] = init_a;
> matrix_b[i][j] = init_b;
> matrix_beta[i][j] = mean_beta
> +2.0 * (((double)rand() / RAND_MAX) - 0.5) * beta_variation;
> }
> }
>
> for (k = 0; k < num_iterations; ++k) {
> reaction_diffusion_texture(matrix_size, d_a, d_b, s,
> matrix_a, matrix_b,
> matrix_a_out, matrix_b_out, matrix_beta);
> matrix_a.swap(matrix_a_out);
> matrix_b.swap(matrix_b_out);
> }
>
> printf("P2\n%d %d\n255\n", matrix_size.cols - 2, matrix_size.rows - 2);
> for (i = 1; i < matrix_size.rows - 1; ++i) {
> for (j = 1; j < matrix_size.cols - 1; ++j) {
> printf("%d ", (int) matrix_a[i][j]);
> }
> printf("\n");
> }
> return 0;
>}
>
>
>------------------------------------------------------------------------
>
>#include <stdlib.h>
>#include <stdio.h>
>
>#include <blitz/array.h>
>
>using namespace blitz;
>
>// using double instead of float saves about 11% in time
>typedef Array<double, 2> Matrix;
>typedef TinyVector<int, 2> MatrixSize;
>
>#define ROWS 0
>#define COLS 1
>
>// using references here costs time !
>void reaction_diffusion_texture(MatrixSize matrix_size,
> double d_a, double d_b, double s,
> Matrix a, Matrix b, Matrix a_out, Matrix b_out,
> Matrix beta)
>{
> int i, j;
> for (i = 1; i < matrix_size[ROWS] - 1; ++i) {
> for (j = 1; j < matrix_size[COLS] - 1; ++j) {
> a_out(i,j) = a(i,j)
> + s * (16 - a(i,j)*b(i,j))
> + d_a * ( a(i+1,j) + a(i-1,j) + a(i,j+1) + a(i,j-1)
> - 4 * a(i,j) );
> b_out(i,j) = b(i,j)
> + s * (a(i,j) * b(i,j) - b(i,j) - beta(i,j))
> + d_b * ( b(i+1,j) + b(i-1,j) + b(i,j+1) + b(i,j-1)
> - 4 * b(i,j) );
> if (a_out(i,j) < 0.0) a_out(i,j) = 0.0;
> if (b_out(i,j) < 0.0) b_out(i,j) = 0.0;
> }
> }
>}
>
>int main(int argc, char* argv[]) {
> double init_a = 4;
> double init_b = 4;
> double d_a = .125;
> double d_b = .03125;
> double s = .005;
> double mean_beta = 12;
> double beta_variation = .05;
> int beta_random_seed = 1;
> int i, j, k;
> int num_iterations = 30000;
>
> Matrix matrix_a;
> Matrix matrix_b;
> Matrix matrix_a_out;
> Matrix matrix_b_out;
> Matrix matrix_swap_tmp;
> Matrix matrix_beta;
>
> MatrixSize matrix_size = shape(64+2, 64+2);
>
> matrix_a.resize(matrix_size);
> matrix_b.resize(matrix_size);
> matrix_a_out.resize(matrix_size);
> matrix_b_out.resize(matrix_size);
> matrix_swap_tmp.resize(matrix_size);
> matrix_beta.resize(matrix_size);
>
> srand(beta_random_seed);
> for (i = 0; i < matrix_size[ROWS]; ++i) {
> for (j = 0; j < matrix_size[COLS]; ++j) {
> matrix_a(i,j) = init_a;
> matrix_b(i,j) = init_b;
> matrix_beta(i,j) = mean_beta
> +2.0 * (((double)rand() / RAND_MAX) - 0.5) * beta_variation;
> }
> }
>
> for (k = 0; k < num_iterations; ++k) {
> reaction_diffusion_texture(matrix_size, d_a, d_b, s,
> matrix_a, matrix_b,
> matrix_a_out, matrix_b_out, matrix_beta);
>// matrix_a.swap(matrix_a_out); // only defined for std::vector
>// matrix_b.swap(matrix_b_out); // not for blitz::Array :(
> matrix_swap_tmp.reference(matrix_a); matrix_a.reference(matrix_a_out); matrix_a_out.reference(matrix_swap_tmp);
> matrix_swap_tmp.reference(matrix_b); matrix_b.reference(matrix_b_out); matrix_b_out.reference(matrix_swap_tmp);
>
> // slower than above:
>// swap(matrix_a, matrix_a_out);
>// swap(matrix_b, matrix_b_out);
> // slower than swap:
>// matrix_swap_tmp = matrix_a; matrix_a = matrix_a_out; matrix_a_out = matrix_swap_tmp;
>// matrix_swap_tmp = matrix_b; matrix_b = matrix_b_out; matrix_b_out = matrix_swap_tmp;
> }
>
> printf("P2\n%d %d\n255\n", matrix_size[COLS] - 2, matrix_size[ROWS] - 2);
> for (i = 1; i < matrix_size[ROWS] - 1; ++i) {
> for (j = 1; j < matrix_size[COLS] - 1; ++j) {
> printf("%d ", (int) matrix_a(i,j));
> }
> printf("\n");
> }
>#ifdef BZ_DEBUG
> fprintf(stderr, "WARNING BZ_DEBUG IS ENABLED!\n");
>#endif
> return 0;
>}
>
>
>------------------------------------------------------------------------
>
>#include <stdlib.h>
>#include <stdio.h>
>
>struct MatrixSize {
> int rows;
> int cols;
> int operator[](int dim) { return *(&rows + dim); }
>};
>
>// low overhead version of blitz::Array
>// default assigment operator and copy constructor suffice in this example
>// since we do not need deep copy here and the destructor does nothing
>template <typename TYPE>
>class TMatrix {
> MatrixSize size;
> TYPE* values;
>public:
> TMatrix() : values(0) { size.rows = size.cols = 0; }
> TMatrix(MatrixSize size) : size(size),
> values(new TYPE[size.rows * size.cols]) {}
> ~TMatrix() { /*delete[] values;*/ }
> TYPE operator()(int i, int j) const { return values[size.cols * i + j]; }
> TYPE& operator()(int i, int j) { return values[size.cols * i + j]; }
> void resize(MatrixSize size) {
> this->size = size;
> if (values != 0) delete values;
> values = new TYPE[size.rows * size.cols];
> }
> void reference(TMatrix& other) {
> *this = other;
> }
>};
>typedef TMatrix<double> Matrix;
>
>#define ROWS 0
>#define COLS 1
>
>// But using references here saves time !!
>void reaction_diffusion_texture(MatrixSize matrix_size,
> double d_a, double d_b, double s,
> Matrix& a, Matrix& b, Matrix& a_out, Matrix& b_out,
> Matrix& beta)
>{
> int i, j;
> for (i = 1; i < matrix_size[ROWS] - 1; ++i) {
> for (j = 1; j < matrix_size[COLS] - 1; ++j) {
> a_out(i,j) = a(i,j)
> + s * (16 - a(i,j)*b(i,j))
> + d_a * ( a(i+1,j) + a(i-1,j) + a(i,j+1) + a(i,j-1)
> - 4 * a(i,j) );
> b_out(i,j) = b(i,j)
> + s * (a(i,j) * b(i,j) - b(i,j) - beta(i,j))
> + d_b * ( b(i+1,j) + b(i-1,j) + b(i,j+1) + b(i,j-1)
> - 4 * b(i,j) );
> if (a_out(i,j) < 0.0) a_out(i,j) = 0.0;
> if (b_out(i,j) < 0.0) b_out(i,j) = 0.0;
> }
> }
>}
>
>int main(int argc, char* argv[]) {
> double init_a = 4;
> double init_b = 4;
> double d_a = .125;
> double d_b = .03125;
> double s = .005;
> double mean_beta = 12;
> double beta_variation = .05;
> int beta_random_seed = 1;
> int i, j, k;
> int num_iterations = 30000;
>
> Matrix matrix_a;
> Matrix matrix_b;
> Matrix matrix_a_out;
> Matrix matrix_b_out;
> Matrix matrix_swap_tmp;
> Matrix matrix_beta;
>
> MatrixSize matrix_size = {64+2, 64+2}; //shape(64+2, 64+2);
>
> matrix_a.resize(matrix_size);
> matrix_b.resize(matrix_size);
> matrix_a_out.resize(matrix_size);
> matrix_b_out.resize(matrix_size);
> matrix_swap_tmp.resize(matrix_size);
> matrix_beta.resize(matrix_size);
>
> srand(beta_random_seed);
> for (i = 0; i < matrix_size[ROWS]; ++i) {
> for (j = 0; j < matrix_size[COLS]; ++j) {
> matrix_a(i,j) = init_a;
> matrix_b(i,j) = init_b;
> matrix_beta(i,j) = mean_beta
> +2.0 * (((double)rand() / RAND_MAX) - 0.5) * beta_variation;
> }
> }
>
> for (k = 0; k < num_iterations; ++k) {
> reaction_diffusion_texture(matrix_size, d_a, d_b, s,
> matrix_a, matrix_b,
> matrix_a_out, matrix_b_out, matrix_beta);
> matrix_swap_tmp.reference(matrix_a); matrix_a.reference(matrix_a_out); matrix_a_out.reference(matrix_swap_tmp);
> matrix_swap_tmp.reference(matrix_b); matrix_b.reference(matrix_b_out); matrix_b_out.reference(matrix_swap_tmp);
> // slower than above:
>// swap(matrix_a, matrix_a_out);
>// swap(matrix_b, matrix_b_out);
> // slower than swap:
>// matrix_swap_tmp = matrix_a; matrix_a = matrix_a_out; matrix_a_out = matrix_swap_tmp;
>// matrix_swap_tmp = matrix_b; matrix_b = matrix_b_out; matrix_b_out = matrix_swap_tmp;
> }
>
> printf("P2\n%d %d\n255\n", matrix_size[COLS] - 2, matrix_size[ROWS] - 2);
> for (i = 1; i < matrix_size[ROWS] - 1; ++i) {
> for (j = 1; j < matrix_size[COLS] - 1; ++j) {
> printf("%d ", (int) matrix_a(i,j));
> }
> printf("\n");
> }
>#ifdef BZ_DEBUG
> fprintf(stderr, "WARNING BZ_DEBUG IS ENABLED!\n");
>#endif
> return 0;
>}
>
>
>------------------------------------------------------------------------
>
>#include <stdlib.h>
>#include <stdio.h>
>
>#include <blitz/array.h>
>#include <blitz/funcs.h>
>
>using namespace blitz;
>
>typedef Array<double, 2> Matrix;
>typedef TinyVector<int, 2> MatrixSize;
>
>#define ROWS 0
>#define COLS 1
>
>BZ_DECLARE_STENCIL8(spots2D_stencil,beta,d_a,d_b,s,a,b,a_out,b_out)
> a_out = a + s * (16 - a*b)
> + d_a * /*(a(-1,0)+a(+1,0)+a(0,-1)+a(0,+1)-4*a(0,0))*/Laplacian2D(a);
> b_out = b + s * (a*b - b - beta)
> + d_b * /*(b(-1,0)+b(+1,0)+b(0,-1)+b(0,+1)-4*b(0,0))*/Laplacian2D(b);
> if (a_out < 0.0) a_out = 0.0;
> if (b_out < 0.0) b_out = 0.0;
>BZ_END_STENCIL_WITH_SHAPE(shape(-1,-1), shape(+1,+1)); // defining a shape saves about 2% in running time
>// slower:
>// BZ_DECLARE_STENCIL8(spots2D_stencil2,beta,d_a,d_b,s,a,b,a_out,b_out)
>// a_out = a + s * (16 - a*b)
>// + d_a * (a(-1,0)+a(+1,0)+a(0,-1)+a(0,+1)-4*a(0,0))/*Laplacian2D(a)*/;
>// b_out = b + s * (a*b - b - beta)
>// + d_b * (b(-1,0)+b(+1,0)+b(0,-1)+b(0,+1)-4*b(0,0))/*Laplacian2D(b)*/;
>// if (a_out < 0.0) a_out = 0.0;
>// if (b_out < 0.0) b_out = 0.0;
>// BZ_END_STENCIL_WITH_SHAPE(shape(-1,-1), shape(+1,+1)); // defining a shape saves about 2% in running time
>
>int main(int argc, char* argv[]) {
> double init_a = 4;
> double init_b = 4;
> double d_a = .125;
> double d_b = .03125;
> double s = .005;
> double mean_beta = 12;
> double beta_variation = .05;
> int beta_random_seed = 1;
> int i, j, k;
> int num_iterations = 30000;
>
> Matrix matrix_a;
> Matrix matrix_b;
> Matrix matrix_a_out;
> Matrix matrix_b_out;
> Matrix matrix_swap_tmp;
> Matrix matrix_beta;
>
> MatrixSize matrix_size = shape(64+2, 64+2);
>
> matrix_a.resize(matrix_size);
> matrix_b.resize(matrix_size);
> matrix_a_out.resize(matrix_size);
> matrix_b_out.resize(matrix_size);
> matrix_swap_tmp.resize(matrix_size);
> matrix_beta.resize(matrix_size);
>
> // stencils only accept matrices, so we need matrices for the constants
> Matrix matrix_d_a(matrix_size);
> Matrix matrix_d_b(matrix_size);
> Matrix matrix_s(matrix_size);
> matrix_d_a = d_a;
> matrix_d_b = d_b;
> matrix_s = s;
>
> srand(beta_random_seed);
> for (i = 0; i < matrix_size[ROWS]; ++i) {
> for (j = 0; j < matrix_size[COLS]; ++j) {
> matrix_a(i,j) = init_a;
> matrix_b(i,j) = init_b;
> matrix_beta(i,j) = mean_beta
> +2.0 * (((double)rand() / RAND_MAX) - 0.5) * beta_variation;
> }
> }
>
> for (k = 0; k < num_iterations; ++k) {
> applyStencil(spots2D_stencil(),matrix_beta,
> matrix_d_a,matrix_d_b,matrix_s,
> matrix_a,matrix_b,matrix_a_out,matrix_b_out);
> matrix_swap_tmp.reference(matrix_a); matrix_a.reference(matrix_a_out); matrix_a_out.reference(matrix_swap_tmp);
> matrix_swap_tmp.reference(matrix_b); matrix_b.reference(matrix_b_out); matrix_b_out.reference(matrix_swap_tmp);
> }
>
> printf("P2\n%d %d\n255\n", matrix_size[COLS] - 2, matrix_size[ROWS] - 2);
> for (i = 1; i < matrix_size[ROWS] - 1; ++i) {
> for (j = 1; j < matrix_size[COLS] - 1; ++j) {
> printf("%d ", (int) matrix_a(i,j));
> }
> printf("\n");
> }
>#ifdef BZ_DEBUG
> fprintf(stderr, "WARNING BZ_DEBUG IS ENABLED!\n");
>#endif
> return 0;
>}
>
>
>------------------------------------------------------------------------
>
>#include <stdlib.h>
>#include <stdio.h>
>
>#include <blitz/array.h>
>#include <blitz/funcs.h>
>
>using namespace blitz;
>
>typedef Array<double, 2> Matrix;
>typedef TinyVector<int, 2> MatrixSize;
>
>#define ROWS 0
>#define COLS 1
>
>BZ_DECLARE_STENCIL6(spots2D_stencil_a,beta,d_a,s,a,b,a_out)
> a_out = a + s * (16 - a*b)
> + d_a * /*(a(-1,0)+a(+1,0)+a(0,-1)+a(0,+1)-4*a(0,0))*/Laplacian2D(a);
> if (a_out < 0.0) a_out = 0.0;
>BZ_END_STENCIL_WITH_SHAPE(shape(-1,-1), shape(+1,+1)); // defining a shape saves about 2% in running time
>BZ_DECLARE_STENCIL6(spots2D_stencil_b,beta,d_b,s,a,b,b_out)
> b_out = b + s * (a*b - b - beta)
> + d_b * /*(b(-1,0)+b(+1,0)+b(0,-1)+b(0,+1)-4*b(0,0))*/Laplacian2D(b);
> if (b_out < 0.0) b_out = 0.0;
>BZ_END_STENCIL_WITH_SHAPE(shape(-1,-1), shape(+1,+1)); // defining a shape saves about 2% in running time
>int main(int argc, char* argv[]) {
> double init_a = 4;
> double init_b = 4;
> double d_a = .125;
> double d_b = .03125;
> double s = .005;
> double mean_beta = 12;
> double beta_variation = .05;
> int beta_random_seed = 1;
> int i, j, k;
> int num_iterations = 30000;
>
> Matrix matrix_a;
> Matrix matrix_b;
> Matrix matrix_a_out;
> Matrix matrix_b_out;
> Matrix matrix_swap_tmp;
> Matrix matrix_beta;
>
> MatrixSize matrix_size = shape(64+2, 64+2);
>
> matrix_a.resize(matrix_size);
> matrix_b.resize(matrix_size);
> matrix_a_out.resize(matrix_size);
> matrix_b_out.resize(matrix_size);
> matrix_swap_tmp.resize(matrix_size);
> matrix_beta.resize(matrix_size);
>
> // stencils only accept matrices, so we need matrices for the constants
> Matrix matrix_d_a(matrix_size);
> Matrix matrix_d_b(matrix_size);
> Matrix matrix_s(matrix_size);
> matrix_d_a = d_a;
> matrix_d_b = d_b;
> matrix_s = s;
>
> srand(beta_random_seed);
> for (i = 0; i < matrix_size[ROWS]; ++i) {
> for (j = 0; j < matrix_size[COLS]; ++j) {
> matrix_a(i,j) = init_a;
> matrix_b(i,j) = init_b;
> matrix_beta(i,j) = mean_beta
> +2.0 * (((double)rand() / RAND_MAX) - 0.5) * beta_variation;
> }
> }
>
> for (k = 0; k < num_iterations; ++k) {
> applyStencil(spots2D_stencil_a(),matrix_beta,
> matrix_d_a,matrix_s,
> matrix_a,matrix_b,matrix_a_out);
> applyStencil(spots2D_stencil_b(),matrix_beta,
> matrix_d_b,matrix_s,
> matrix_a,matrix_b,matrix_b_out);
> matrix_swap_tmp.reference(matrix_a); matrix_a.reference(matrix_a_out); matrix_a_out.reference(matrix_swap_tmp);
> matrix_swap_tmp.reference(matrix_b); matrix_b.reference(matrix_b_out); matrix_b_out.reference(matrix_swap_tmp);
> }
>
> printf("P2\n%d %d\n255\n", matrix_size[COLS] - 2, matrix_size[ROWS] - 2);
> for (i = 1; i < matrix_size[ROWS] - 1; ++i) {
> for (j = 1; j < matrix_size[COLS] - 1; ++j) {
> printf("%d ", (int) matrix_a(i,j));
> }
> printf("\n");
> }
>#ifdef BZ_DEBUG
> fprintf(stderr, "WARNING BZ_DEBUG IS ENABLED!\n");
>#endif
> return 0;
>}
>
>
>------------------------------------------------------------------------
>
>#include <stdlib.h>
>#include <stdio.h>
>
>#include <vector>
>
>// using "flat" matrices
>
>typedef struct {
> int rows;
> int cols;
>} MatrixSize;
>
>typedef std::vector<double> Matrix;
>
>Matrix newMatrix(MatrixSize size) {
> return Matrix(size.rows * size.cols);
>}
>
>#define POS(i,j) ((i) * matrix_size.cols + (j))
>
>void reaction_diffusion_texture(MatrixSize matrix_size,
> double d_a, double d_b, double s,
> Matrix& a, Matrix& b, Matrix& a_out, Matrix& b_out,
> Matrix& beta)
>{
> int i, j;
> for (i = 1; i < matrix_size.rows - 1; ++i) {
> for (j = 1; j < matrix_size.cols - 1; ++j) {
> a_out[POS(i,j)] = a[POS(i,j)]
> + s * (16 - a[POS(i,j)]*b[POS(i,j)])
> + d_a * ( a[POS(i+1,j)] + a[POS(i-1,j)] + a[POS(i,j)+1] + a[POS(i,j)-1]
> - 4 * a[POS(i,j)] );
> b_out[POS(i,j)] = b[POS(i,j)]
> + s * (a[POS(i,j)] * b[POS(i,j)] - b[POS(i,j)] - beta[POS(i,j)])
> + d_b * ( b[POS(i+1,j)] + b[POS(i-1,j)] + b[POS(i,j)+1] + b[POS(i,j)-1]
> - 4 * b[POS(i,j)] );
> if (a_out[POS(i,j)] < 0.0) a_out[POS(i,j)] = 0.0;
> if (b_out[POS(i,j)] < 0.0) b_out[POS(i,j)] = 0.0;
> }
> }
>}
>
>int main(int argc, char* argv[]) {
> double init_a = 4;
> double init_b = 4;
> double d_a = .125;
> double d_b = .03125;
> double s = .005;
> double mean_beta = 12;
> double beta_variation = .05;
> int beta_random_seed = 1;
> int i, j, k;
> int num_iterations = 30000;
>
> Matrix matrix_a;
> Matrix matrix_b;
> Matrix matrix_a_out;
> Matrix matrix_b_out;
> Matrix matrix_swap_tmp;
> Matrix matrix_beta;
>
> MatrixSize matrix_size = { 64+2, 64+2 };
>
> matrix_a = newMatrix(matrix_size);
> matrix_b = newMatrix(matrix_size);
> matrix_a_out = newMatrix(matrix_size);
> matrix_b_out = newMatrix(matrix_size);
> matrix_swap_tmp = newMatrix(matrix_size);
> matrix_beta = newMatrix(matrix_size);
>
> srand(beta_random_seed);
> for (i = 0; i < matrix_size.rows; ++i) {
> for (j = 0; j < matrix_size.cols; ++j) {
> matrix_a[POS(i,j)] = init_a;
> matrix_b[POS(i,j)] = init_b;
> matrix_beta[POS(i,j)] = mean_beta
> +2.0 * (((double)rand() / RAND_MAX) - 0.5) * beta_variation;
> }
> }
>
> for (k = 0; k < num_iterations; ++k) {
> reaction_diffusion_texture(matrix_size, d_a, d_b, s,
> matrix_a, matrix_b,
> matrix_a_out, matrix_b_out, matrix_beta);
> matrix_a.swap(matrix_a_out); //!!
> matrix_b.swap(matrix_b_out); //!!
> }
>
> printf("P2\n%d %d\n255\n", matrix_size.cols - 2, matrix_size.rows - 2);
> for (i = 1; i < matrix_size.rows - 1; ++i) {
> for (j = 1; j < matrix_size.cols - 1; ++j) {
> printf("%d ", (int) matrix_a[POS(i,j)]);
> }
> printf("\n");
> }
> return 0;
>}
>
>
>------------------------------------------------------------------------
>
>#include <stdlib.h>
>#include <stdio.h>
>
>typedef struct {
> int rows;
> int cols;
>} MatrixSize;
>
>typedef double** Matrix;
>
>Matrix newMatrix(MatrixSize size) {
> int i;
> Matrix result = (double**) malloc(size.rows * sizeof(double*));
> for (i = 0; i < size.rows; ++i) {
> result[i] = (double*) malloc(size.cols * sizeof(double));
> }
> return result;
>}
>
>void reaction_diffusion_texture(MatrixSize matrix_size,
> double d_a, double d_b, double s,
> Matrix a, Matrix b, Matrix a_out, Matrix b_out,
> Matrix beta)
>{
> int i, j;
> for (i = 1; i < matrix_size.rows - 1; ++i) {
> for (j = 1; j < matrix_size.cols - 1; ++j) {
> a_out[i][j] = a[i][j]
> + s * (16 - a[i][j]*b[i][j])
> + d_a * ( a[i+1][j] + a[i-1][j] + a[i][j+1] + a[i][j-1]
> - 4 * a[i][j] );
> b_out[i][j] = b[i][j]
> + s * (a[i][j] * b[i][j] - b[i][j] - beta[i][j])
> + d_b * ( b[i+1][j] + b[i-1][j] + b[i][j+1] + b[i][j-1]
> - 4 * b[i][j] );
> if (a_out[i][j] < 0.0) a_out[i][j] = 0.0;
> if (b_out[i][j] < 0.0) b_out[i][j] = 0.0;
> }
> }
>}
>
>int main(int argc, char* argv[]) {
> double init_a = 4;
> double init_b = 4;
> double d_a = .125;
> double d_b = .03125;
> double s = .005;
> double mean_beta = 12;
> double beta_variation = .05;
> int beta_random_seed = 1;
> int i, j, k;
> int num_iterations = 30000;
>
> Matrix matrix_a;
> Matrix matrix_b;
> Matrix matrix_a_out;
> Matrix matrix_b_out;
> Matrix matrix_swap_tmp;
> Matrix matrix_beta;
>
> MatrixSize matrix_size = { 64+2, 64+2 };
>
> matrix_a = newMatrix(matrix_size);
> matrix_b = newMatrix(matrix_size);
> matrix_a_out = newMatrix(matrix_size);
> matrix_b_out = newMatrix(matrix_size);
> matrix_swap_tmp = newMatrix(matrix_size);
> matrix_beta = newMatrix(matrix_size);
>
> srand(beta_random_seed);
> for (i = 0; i < matrix_size.rows; ++i) {
> for (j = 0; j < matrix_size.cols; ++j) {
> matrix_a[i][j] = init_a;
> matrix_b[i][j] = init_b;
> matrix_beta[i][j] = mean_beta
> +2.0 * (((double)rand() / RAND_MAX) - 0.5) * beta_variation;
> }
> }
>
> for (k = 0; k < num_iterations; ++k) {
> reaction_diffusion_texture(matrix_size, d_a, d_b, s,
> matrix_a, matrix_b,
> matrix_a_out, matrix_b_out, matrix_beta);
> matrix_swap_tmp = matrix_a; matrix_a = matrix_a_out; matrix_a_out = matrix_swap_tmp;
> matrix_swap_tmp = matrix_b; matrix_b = matrix_b_out; matrix_b_out = matrix_swap_tmp;
> }
>
> printf("P2\n%d %d\n255\n", matrix_size.cols - 2, matrix_size.rows - 2);
> for (i = 1; i < matrix_size.rows - 1; ++i) {
> for (j = 1; j < matrix_size.cols - 1; ++j) {
> printf("%d ", (int) matrix_a[i][j]);
> }
> printf("\n");
> }
> return 0;
>}
>
>
>------------------------------------------------------------------------
>
>#include <stdlib.h>
>#include <stdio.h>
>
>// using "flat" matrices
>
>typedef struct {
> int rows;
> int cols;
>} MatrixSize;
>
>typedef double* Matrix;
>
>Matrix newMatrix(MatrixSize size) {
> Matrix result = (double*) malloc(size.rows * size.cols * sizeof(double));
> return result;
>}
>
>#define POS(i,j) ((i) * matrix_size.cols + (j))
>
>void reaction_diffusion_texture(MatrixSize matrix_size,
> double d_a, double d_b, double s,
> Matrix a, Matrix b, Matrix a_out, Matrix b_out,
> Matrix beta)
>{
> int i, j;
> for (i = 1; i < matrix_size.rows - 1; ++i) {
> for (j = 1; j < matrix_size.cols - 1; ++j) {
> a_out[POS(i,j)] = a[POS(i,j)]
> + s * (16 - a[POS(i,j)]*b[POS(i,j)])
> + d_a * ( a[POS(i+1,j)] + a[POS(i-1,j)] + a[POS(i,j)+1] + a[POS(i,j)-1]
> - 4 * a[POS(i,j)] );
> b_out[POS(i,j)] = b[POS(i,j)]
> + s * (a[POS(i,j)] * b[POS(i,j)] - b[POS(i,j)] - beta[POS(i,j)])
> + d_b * ( b[POS(i+1,j)] + b[POS(i-1,j)] + b[POS(i,j)+1] + b[POS(i,j)-1]
> - 4 * b[POS(i,j)] );
> if (a_out[POS(i,j)] < 0.0) a_out[POS(i,j)] = 0.0;
> if (b_out[POS(i,j)] < 0.0) b_out[POS(i,j)] = 0.0;
> }
> }
>}
>
>int main(int argc, char* argv[]) {
> double init_a = 4;
> double init_b = 4;
> double d_a = .125;
> double d_b = .03125;
> double s = .005;
> double mean_beta = 12;
> double beta_variation = .05;
> int beta_random_seed = 1;
> int i, j, k;
> int num_iterations = 30000;
>
> Matrix matrix_a;
> Matrix matrix_b;
> Matrix matrix_a_out;
> Matrix matrix_b_out;
> Matrix matrix_swap_tmp;
> Matrix matrix_beta;
>
> MatrixSize matrix_size = { 64+2, 64+2 };
>
> matrix_a = newMatrix(matrix_size);
> matrix_b = newMatrix(matrix_size);
> matrix_a_out = newMatrix(matrix_size);
> matrix_b_out = newMatrix(matrix_size);
> matrix_swap_tmp = newMatrix(matrix_size);
> matrix_beta = newMatrix(matrix_size);
>
> srand(beta_random_seed);
> for (i = 0; i < matrix_size.rows; ++i) {
> for (j = 0; j < matrix_size.cols; ++j) {
> matrix_a[POS(i,j)] = init_a;
> matrix_b[POS(i,j)] = init_b;
> matrix_beta[POS(i,j)] = mean_beta
> +2.0 * (((double)rand() / RAND_MAX) - 0.5) * beta_variation;
> }
> }
>
> for (k = 0; k < num_iterations; ++k) {
> reaction_diffusion_texture(matrix_size, d_a, d_b, s,
> matrix_a, matrix_b,
> matrix_a_out, matrix_b_out, matrix_beta);
> matrix_swap_tmp = matrix_a; matrix_a = matrix_a_out; matrix_a_out = matrix_swap_tmp;
> matrix_swap_tmp = matrix_b; matrix_b = matrix_b_out; matrix_b_out = matrix_swap_tmp;
> }
>
> printf("P2\n%d %d\n255\n", matrix_size.cols - 2, matrix_size.rows - 2);
> for (i = 1; i < matrix_size.rows - 1; ++i) {
> for (j = 1; j < matrix_size.cols - 1; ++j) {
> printf("%d ", (int) matrix_a[POS(i,j)]);
> }
> printf("\n");
> }
> return 0;
>}
>
>
>------------------------------------------------------------------------
>
>time ./myprogram1 > test1.pgm
>
>real 0m2.540s
>user 0m2.508s
>sys 0m0.001s
>time ./myprogram1b > test1b.pgm
>
>real 0m2.653s
>user 0m2.642s
>sys 0m0.001s
>time ./myprogram1bb > test1bb.pgm
>
>real 0m4.400s
>user 0m4.370s
>sys 0m0.000s
>time ./myprogram1bbb > test1bbb.pgm
>
>real 0m4.151s
>user 0m4.135s
>sys 0m0.001s
>time ./myprogram1c > test1c.pgm
>
>real 0m3.783s
>user 0m3.757s
>sys 0m0.002s
>time ./myprogram2 > test2.pgm
>
>real 0m6.148s
>user 0m6.115s
>sys 0m0.000s
>time ./myprogram2b > test2b.pgm
>
>real 0m4.855s
>user 0m4.833s
>sys 0m0.001s
>time ./myprogram3 > test3.pgm
>
>real 0m5.387s
>user 0m5.355s
>sys 0m0.003s
>time ./myprogram3b > test3b.pgm
>
>real 0m4.388s
>user 0m4.268s
>sys 0m0.002s
>
>
>------------------------------------------------------------------------
>
>_______________________________________________
>Blitz-support mailing list
>Blitz-support_at_[hidden]
>http://www.oonumerics.org/mailman/listinfo.cgi/blitz-support
>

-- 
Dr. Julian C. Cummings                       E-mail: cummings_at_[hidden]
California Institute of Technology           Phone:  626-395-2543
1200 E. California Blvd., Mail Code 158-79   Fax:    626-584-5917
Pasadena, CA 91125