I'll look at these slightly out of order because the contraction question
raises some of the most interesting things to me.
At 05:39 PM 6/12/98 -0600, Todd L. Veldhuizen wrote:
>1. What is a reasonable upper bound on rank for a tensor?
Well, there is one certainty in life: as soon as you put a limit on it,
someone will want then next one. The highest rank I've seen physically was
4th, but as we see below, multiplying tensors together will generate
intermediate tensors of higher rank.
>2. Many kinds of tensors
My guess is that an engine or policy template parameter will work best for
this.
>3. Should tensor indices be associated with a particular tensor?
A lot of tensor mechanics cares deeply about what you contract with what.
Having some way to state that I think is key, and using tensor indexes is a
convenient notation. I don't think it is required, but it can be done, see
below.
>5. Expression templates?
Unnecessary for this, I think, for the reasons Todd states.
>4. How to handle contractions
>Array<float,2> r = sum(A(i,k) * B(j,k), k);
>tensor tst1 = t2("ij")*t4("ijkl")*t4("klpq")*t2("pq");
This is a little bit tricky. The notation above has lots of advantages
since it mimics a tensor notation which is very efficient for representing
what you want. Personally, I consider the notation with the explicit
summation to be a very minor tweak on the notation without it. Writing the
last line as:
sum(t2("ij")*sum(t4("ijkl")*sum(t4("klpq")*t2("pq"),"pq"),"kl"),"ij" )
is just fine with me. The types of objects we're considering here are
intended to be small and have no overhead at run time, so string parsing
won't work, but being able to specify the names i and j and k and so on
yourself rather than have them be fixed in some way would be nice. If I
want to use mu and nu and so on for my subscripts, I should be able to do
that.
Let's break down what we mean by a statement like the first one above a
bit. Multiplying those two rank two tensors would have to be a rank 4
tensor. That is, it would be essentially:
C(i,j,k,l) = A(i,j)*B(k,l);
Then we want to contract that on j and l:
D(i,k) = sum( C(i,j,k,l)*delta(j,l) , j, l );
which we might want to write as:
D(i,k) = contract( C(i,j,k,l) , j,l );
since contractions would be common. Inlining turns this into:
D(i,k) = contract( A(i,j)*B(k,l) , j,l );
Now, if we were to drop the indexes, we could write this like:
D = contract( A*B , 2 , 4 );
to indicate contraction on the second and fourth indexes. We would
probably want to make those compile time parameters with something like:
D = contract<2,4>(A*B);
The rule here is that operator* is an outer product. With this scheme, the
complex expression above would look like:
contract<1,2>(t2*contract<3,5,4,6>(t4*contract<3,5,4,6>(t4*t2))
where the contract with four template args contracts 3 and 5 together and 4
and 6 together. The highest rank tensor generated here is 6. If we were
to put all the contractions on the outside, the highest would be 12. You
would still get the same answer but slower on the compile and probably on
run-time unless you have a godlike CSE engine.
This is certainly general enough to represent all the same permutations as
the original notation, but it is just as certainly less elegant. Is there
a way to recover the previous elegance?
Suppose you have the distinct types I J K L. Suppose you have objects of
each type i j k l. Then the statement:
D(i,k) = contract<J,L>(A(i,j)*B(k,l))
has much the same information as the contraction above except that there is
no real need for I J K L to be ordered like the integers were. If you can
do that then you can do:
D(i,j) = sum(A(i,k)*B(j,k),k);
where the type of K is obtained from the argument k instead of using an
explicit template argument. We're back to a notation we like, but a
miracle has occurred because it isn't yet clear what the result of
A(i,k)*B(j,k) is, or even what A(i,k) is.
Let's start with A(i,k)*B(j,k). I think it has to be a rank 4 tensor with
a particular set of indexes. In the same way A(i,k) has to be a rank two
tensor with particular indexes. If we want the ranks of the tensor to be
identified with particular types, those types have to be carried along as
part of the *type* of the tensor. Since compile-time decisions are made
using the subscript, the information needs to be compile time.
This could be done with the engine pattern. Suppose the rank 2 tensor type
is something like
Tensor2D< dim , dim , double , Full >
meaning that it is rank two, it is Dim by Dim in size, uses double
precision and it uses full storage. Subscripting that with (i,k) could
produce a tensor of type:
Tensor2D< dim , dim , double , Indexed<Full,I,K> >
Here the engine tag means that it is an indexed tensor with full storage
and the subscripts are I and K. Or perhaps it would be a type that can
scale to arbitrary rank like
Tensor< Dims<dim,dim> , double , Indexed<Full,I,K> >
The type Dims would encode the rank and the size of each dimension.
Combining that information with the engine tag would give an engine of that
rank. This is a situation where I like having an engine tag more than the
actual engine as the template parameter: the engine tag remains
comparitively invariant was we change the other template parameters to
Tensor. Multiplying A(i,k)*B(j,k) together could give:
Tensor< Dims<dim,dim,dim,dim> , double , Indexed<Full,I,K,J,K> >
Then the sum function would have enough information to, in theory, do the
sum over K. With enough template metaprogram wizardry I'm sure it could be
done. The complex expression would then look like:
sum(t2(i,j)*sum(t4(i,j,k,l)*sum(t4(k,l,p,q)*t2(p,q),p,q),k,l),i,j)
Let's look at how many terms are involved with this. Since the sums are
pushed into the expression as far as possible, the worst case is:
sum(t4(k,l,p,q)*t2(p,q),p,q)
Suppose dim==4 for each subscript and we have full storage. t4*t2 is then
rank 6, for 4096 terms. That gets summed down to a rank 2 tensor with 16
terms, and 256 terms get summed into each one of those. Somewhere, a
compiler is screaming...
Suppose both t4 and t2 are symmetric under all permutations. Then the
product of those, call it t6, is symmetric under permutations of the first
four or the last two, but you can't permute between them. This shows a
need for general permutation patterns in tensors. The number of distinct
elements in that rank 6 tensor is the product of the ranks of the symmetric
rank 4 tensor (35) times the number in the symmetric rank 2 tensor (10) for
350 distinct terms which get summed up into the 10 terms of the result.
That is still pretty hairy, but not as bad.
This could be refined in various ways, but it looks feasible to me.
-Steve Karmesin
======================================================================
Pooma Team Leader
karmesin@lanl.gov
phone: (505)665-6019
fax : (505)667-4939
For PGP public key, send message with subject line: SEND PUBLIC KEY
======================================================================
--------------------- blitz-dev list --------------------------------
* To subscribe/unsubscribe: mail to majordomo@oonumerics.org, with
"subscribe blitz-dev" or "unsubscribe blitz-dev" in the body of the message
* Blitz++ web page: http://oonumerics.org/blitz/
This archive was generated by hypermail 2b29 : Wed Feb 20 2002 - 04:30:04 EST