Currently in Blitz++ stencils are implemented using subarray syntax.
For example:
Range I(1,N-2), J(1,N-2), K(1,N-2);
P3(I,J,K) = (2-6*c(I,J,K)) * P2(I,J,K)
+ c(I,J,K)*(P2(I-1,J,K) + P2(I+1,J,K) + P2(I,J-1,K) + P2(I,J+1,K)
+ P2(I,J,K-1) + P2(I,J,K+1)) - P1(I,J,K);
Problems with this:
- expression templates cause long compile times
- hard to determine spatial extent of the stencil
- get a separate pointer for each term, rather than one for each
array (Dan Quinlan's problem)
- syntax is slightly klunky
Here is a possible solution based on the "stencil objects" idea,
which is originally due to the POOMA people.
Users create an object which encapsulates the stencil. For example:
class acoustic3D_stencil {
public:
template<class T1, class T2, class T3, class T4>
void apply(T1& P1, T2& P2, T3& P3, T4& c) const
{
P3(0,0,0) = (2-6*c(0,0,0)) * P2(0,0,0) + c(0,0,0)*(P2(-1,0,0)
+ P2(+1,0,0) + P2(0,-1,0) + P2(0,+1,0) + P2(0,0,-1)
+ P2(0,0,+1)) - P1(0,0,0);
}
};
The rationale for all the template parameters is that
you can do tricky things (like passing in a special object which
records the spatial extent of the stencil).
This syntax is a bit klunky, so a macro could be used:
DECLARE_STENCIL4(acoustic3D_stencil,P1,P2,P3,c)
P3(0,0,0) = (2-6*c(0,0,0)) * P2(0,0,0) + c(0,0,0)*(P2(-1,0,0)
+ P2(+1,0,0) + P2(0,-1,0) + P2(0,+1,0) + P2(0,0,-1)
+ P2(0,0,+1)) - P1(0,0,0);
END_STENCIL
Even better, this design allows stencils to be composed
from other stencils. For example, with the aid of this
definition (in a blitz library header somewhere):
template<class T>
inline typename T::T_numtype Laplacian3D(T& A)
{
return -6*A(0,0,0) + A(-1,0,0) + A(1,0,0) + A(0,-1,0) + A(0,1,0)
+ A(0,0,-1) + A(0,0,1);
}
The stencil declaration can be written like this:
DECLARE_STENCIL4(acoustic3D_stencil, P1, P2, P3, c)
P3 = 2 * P2 + c * Laplacian3D(P2) - P1;
END_STENCIL
Which is as nice as you could hope for.
This requires that the P1, P2, P3, c objects have appropriate
conversion operators (I've tried all this and it works).
Inside the stencil, you can have multiple assigns
and if/elseif/else blocks:
DECLARE_STENCIL4(someWeirdStencil, A, B, C, D)
// Completely nonsensical operations, just to illustrate
// what you can do
A = 2 * B + C * Laplacian3D(D);
C = C * 0.95;
if (D < 0.0)
B = cross(grad(C),grad(D));
else
B *= sin(D);
END_STENCIL
Now to actually use the stencil, users would invoke some
function, such as:
applyStencil(acoustic3D_stencil, P1, P2, P3, c);
Or, if you wanted a nicer syntax you could probably
swing something like:
acoustic3D_stencil(P1,P2,P3,c);
By overloading operator().
Inside applyStencil, the following things would happen:
o Invoke the stencil once on a set of objects which record
the spatial extent of the stencil
o Use this info to determine the subdomain over which the
stencil is applied
o Apply the stencil, using tiling and other tricks to get
good performance
I've implemented this, and it generates optimal code with
KAI C++.
Some notes:
1. No expression templates: fast compile times.
2. Stencils can be composed together: Laplacian, Grad, higher-order
derivatives, etc. can be put in a library header
3. Knowing spatial extent of the stencil helps with communication
patterns.
4. Simple boundary conditions can be applied, such as periodic BCs,
and zero-padding
5. Takes care of loop jamming and WHERE/ELSEWHERE/ENDWHERE
6. Can provide special "applyStencil" routines, such as applyRed(..)
and applyBlack(..)
7. Solves the Quinlan problem
8. Don't have to implement array expression versions of all possible
scalar operators. For example, cross(A,B) for vector fields.
(Although arguably these are nice to have)
9. Users can do arbitrarily complicated things inside a stencil,
including calling their own functions, iterating, solving small
linear systems, etc.
10. Straightforward(?) to implement threaded versions of applyStencil.
Comments welcomed,
Todd
--------------------- blitz-dev list --------------------------------
* To subscribe/unsubscribe: mail to majordomo@oonumerics.org, with
"subscribe blitz-dev" or "unsubscribe blitz-dev" in the body of the message
* Blitz++ web page: http://oonumerics.org/blitz/
This archive was generated by hypermail 2b29 : Wed Feb 20 2002 - 04:30:05 EST