![]() |
Blitz Devel : |
From: Walter Landry (landry_at_[hidden])
Date: 1998-07-14 19:02:45
Boris wrote:
I just had to rewrite this expression:
>>
>> Copy_Index ii('i'), jj('j');
>> Index i('i'), j('j');
>>
>> K_new(ii,jj)<Lapse*(R(i,j) + Trace_K*K(i,j) - (2*K(i,k)&K_mix(k,j))
>> - 0.5*g(i,j)*(density - (g_up(k,l)|S(k,l))) - S(i,j));
tensor K_new = Lapse*R + Trace_K*K - (2.*K("ik")*K_mix("kj") -
0.5 * g("ij") * (density - g_up("kl")*S("kl")) - S;
or with indices all the way (not necessary since there is no implicit
transositions, and for example Lapse*R("ij") is the same as Lapse*R,
and K_new will get new indices next times it is used in contractions:
tensor K_new = Lapse*R("ij") + Trace_K*K("ij") - (2.*K("ik")*K_mix("kj") -
0.5 * g("ij") * (density - g_up("kl")*S("kl")) - S("ij");
I think that indices should ALWAYS be required. I don't consider it a
hassle to put them in, and it makes the code much more readable. I
think that forcing people to write good code is a good thing, as long
as you don't lose any power in the process. Also, looking at your
example, you have the statement
2.*K("ik")*K_mix("kj")
whereas I have
2*K(i,k)&K_mix(k,j)
In this example, K(i,k) is a symmetric rank 2 tensor, and K_mix(k,j)
is not symmetric. I use the "&" symbol because I want to produce a
rank 2 symmetric matrix out of it. This only requires computation of
6 elements. I have a feeling that your routines compute all 9. There
has to be a way to tell the compiler that you only want 6 elements, or
efficiency will suffer *greatly*. I don't know how to do this except
by using a different symbol for the contraction (which is what I do).
For example:
tensor a = b("ijk") * c("klz") * d("z");
a temorary will be created from c("klz") * d("z"), let's call it
temp("kl") and then used in contraction with b("ijk") wich will create
tensor a.
Actually, I don't think a temp should be created. I think it should
be inlined. My tensor package creates temps, though, because that was
the only way I could think of doing it at the time. I'll discuss it
more later.
> If I wanted to contract an index internally, I would write
>
> Contraction_Index q('q);
> S = T(q,q);
>
Or contract with Kronecker delta.
But using a Kronecker delta seems as awkward as having to use a sum()
function.
> K_new(ii,jj)<Lapse*(R(i,j) + Trace_K*K(i,j) - (2*K(i,k)&K_mix(k,j))
> - 0.5*g(i,j)*(density - (g_up(k,l)|S(k,l))) - S(i,j));
>
In the above example I see only single tensorial multiplications, that
is there is no tensor product with three tensors like in:
tensor A = B("qwer") * c("erty") * D("tyui");
or make it in Blitz notation:
tensor A = B(q,,w,e,r) * c(e,r,t,y) * D(t,y,u,i);
I'm not exactly sure what you're driving at, but I do have some
examples that do this double multiplication
lX(ii,jj) < (g_up(i,k)%dX(j,k) || g_up(j,k)%dX(i,k))
- (2/3.0)*dX(q,q)*g_up(i,j)
+ (g_up(i,k)%(christof(j,k,l)%X(l)) || g_up(j,k)%(christof(i,k,l)%X(l)));
The "||" symbol adds two non_symmetric rank 2 tensors to make a
symmetric rank 2 tensor. Once again, it only does 6 instead of 9
adds.
> On a related subject, it seems like the TinyVec routines are the ideal
> way to implement Tensors. Since Tensors typically have only a few
> members, it would make sense to completely unroll the loops. With the
> expression
>
> K_new(ii,jj) < Lapse*(K(i,k)&K_mix(k,j));
>
> I would want it unrolled at least into
>
> for(i=1;i<=3;i++)
> {
> for(j=1;j<=3;j++)
> {
> K_new(i,j) = Lapse*(K(i,1)*K_mix(1,j) + K(i,2)*K_mix(2,j)
> + K(i,3)*K_mix(3,j));
> }
> }
>
> I use tensors defined on large 3D arrays, so I would want them further
> expanded into a 3D loop, but that seems to be already in place in Blitz.
>
> It does seem like there are times when you don't want things
> completely unrolled. For example, the complete contraction in
>
> K_new(ii,jj) < Lapse*(g(i,j)*(g(k,l)|S(k,l)));
>
> should turn into
>
> temp=g(1,1)*S(1,1) + g(1,2)*S(1,2) + g(1,3)*S(1,3) + g(2,1)*S(2,1) +
> g(2,2)*S(2,2) + g(2,3)*S(2,3) + g(3,1)*S(3,1) + g(3,2)*S(3,2) +
> g(3,3)*S(3,3);
>
> for(i=1;i<=3;i++)
> {
> for(j=1;j<=3;j++)
> {
> K_new(i,j) = Lapse*g(i,j)*temp;
> }
> }
>
> It is not all that clear to me how to formulate rules as to when to
> use temporaries and when to use just unroll. It also depends on
> whether we want to optimize memory of speed. It seems to be worth it
> to figure it out.
>
Food for thoughts!
Actually, I think I've come up with a rule for when to unroll and when
to use temps. If you have an expression
A(...) = B(...)*(C(...)*D(...))
the question is whether C*D should be unrolled inline or a temp be
made. I think the crucial question is whether C*D changes as you
cycle through the indices of A. If it doesn't change, then it is
obvious that recomputing C*D for every index of A is inefficient. If
it only changes for one of the indices, then it is better to unroll.
The simple way to check for this is to see if the result C*D has any
of the indices of A. So in Boris' previous example
tensor a(i,j,l) = b(i,j,k) * c(k,l,z) * d(z);
this algorithm would create the expression
a(i,j,l) = b(i,j,1)*(c(1,l,1)*d(1) + c(1,l,2)*d(2) + c(1,l,3)*d(3))
+ b(i,j,2)*(c(2,l,1)*d(1) + c(2,l,2)*d(2) + c(2,l,3)*d(3))
+ b(i,j,3)*(c(3,l,1)*d(1) + c(3,l,2)*d(2) + c(3,l,3)*d(3))
It would also create what I feel are the "correct" optimizations of
the previous expressions. This really cuts down on the number of
temporaries required (a big problem with my package, and I think
Boris' as well). The only difficulty would be the incredibly large
expressions that might be created. I think Todd wrote somewhere about
the Instruction Cache being only so big. I don't know if we have to
worry about that.
I think (I like it) that contraction should be implicit, and it should
contract or expand tensor of any kind or order (up to some predefined
order). In nDarray set o classes operator*() handles all of that and
it's really not that complex at all.
Well, like I said, I don't think that having a single "*" operator (or
even a single "+" or "-" operator) will be able to make all of the
optimizations required. Sometimes you have to explicitly tell the
computer that you want a symmetric tensor in the end. Contractions
should definitely be implicit, though.
More to ponder,
Walter Landry
landry_at_[hidden]
--------------------- blitz-dev list --------------------------------
* To subscribe/unsubscribe: mail to majordomo_at_[hidden], with
"subscribe blitz-dev" or "unsubscribe blitz-dev" in the body of the message
* Blitz++ web page: http://oonumerics.org/blitz/