Re: OON: f90 vrs CC

From: Nikolai A. Bannov (bannov@c11177-330dan.ece.ncsu.edu)
Date: Wed Mar 19 1997 - 18:10:40 EST


:-) From ferrell@alpheratz.mit.edu Wed Mar 19 16:45 EST 1997
:-) Date: Wed, 19 Mar 1997 16:37:59 -0500
:-) To: bannov@c11177-330dan.ece.ncsu.edu
:-) Cc: oon-list@oonumerics.org
:-) Subject: Re: OON: f90 vrs CC
:-) From: ferrell@cpca.com
:-)
:-)
:-) How do you handle the case where the user writes:
:-) A = A*C?
:-)
:-) This is a perfectly valid invocation, and well defined in F90
:-) (behavior should be as if RHS gets fully evaluated before assignment
:-) is done). Using the function you wrote would give incorrect results
:-) because A would get overwritten before the RHS was fully evaluated.
:-)
:-) I have come across this problem frequently in providing library
:-) routines. I would like to optimize for the most common case: LHS does
:-) not overlap any of the arguments. However, I find it tricky to do
:-) that optimization and still provide correct results in the event user
:-) to writes statements such as A=A*C.
:-)
:-) -robert
:-)
:-) --
:-) ========================================================================
:-) Robert C. Ferrell, PhD E-mail: ferrell@cpca.com
:-) Massachusetts Institute of Technology MIT Office: (617) 253-3961
:-) Cambridge Power Computing Associates, Ltd. CPCA Office:(617) 734-8569
:-) ========================================================================
:-)

I do not write libraries, but
I think the problem is difficult. I have a simple solution for the case
A=A*B and A=B*A, but not for the both cases.

C does not have to store arrays of more than 1 dimension
in sequential memory, fortran does.

I think the fastest solution is to write f77 subroutine which
passes adresses of all A, B, and C to C-function, which analyses
physical adresses and compares with dimensions of arrays, and return
an answer to the question if there is any overlap in the memory,
the next step would be a branching. But it will be procedural, not OO.

Why allocating memory is not acceptable? In general case you have to
do N^3 arithmetic operation, what is much more than N^2 assignments,
it should not deteriorate performance, it takes memory of course,
but thouse who multiply 10^6 x 10^6 matrixes can check yourself
if they overlap, for 10^3 x 10^3 it is not much memory.

I've attached f90 routines `mult' and `mult_for_dummy' which works.
`mult' has side - effects, so strictly it is not standard compatible,
however it will work if no overlap.
`mult_for_dummy' does not have this restriction.

Nik

module example

contains

        function mult(B,C)
          real, dimension(:,:), intent(in) :: B, C
          real, dimension(size(B,1),size(B,1)) :: mult
          integer :: N
          integer :: K,J
             N = size(B,1)
             mult=0. ; do J=1,N ; do K=1,N
             mult(1:N,J)=mult(1:N,J)+B(1:N,K)*C(K,J) ; enddo ; enddo
        end function mult

        function mult_for_dummy(B,C)
          real, dimension(:,:), intent(in) :: B, C
          real, dimension(size(B,1),size(B,1)) :: mult_for_dummy
          real, dimension(:,:), allocatable :: tmp
          integer :: N
          integer :: K,J
             N = size(B,1) ; allocate(tmp(1:N,1:N))
             tmp=0. ; do J=1,N ; do K=1,N
             tmp(1:N,J)=tmp(1:N,J)+B(1:N,K)*C(K,J) ; enddo ; enddo
             mult_for_dummy=tmp
             deallocate(tmp)
        end function mult_for_dummy

end module example

subroutine sample91

use example

implicit NONE

   integer, parameter :: N=2

   real, dimension(N,N) :: A,B,C

   interface operator(.MULTIC.)
      module procedure mult
   end interface

   interface operator(.MULTFD.)
      module procedure mult_for_dummy
   end interface

   B = RESHAPE( (/ 1., 0., 2., 4. /) , (/2,2/) )
   C = RESHAPE( (/ 1., 0., 1., 2. /) , (/2,2/) )

   write(*,'(/" --- matrix B ---"/)')
   write(*,'(4g10.2)')B

   write(*,'(/" --- matrix C ---"/)')
   write(*,'(4g10.2)')C

   A = B .MULTIC. C

   write(*,'(/" --- matrix A = B x C ---"/)')
   write(*,'(4g10.2)')A

   A = B * C

   write(*,'(/" --- matrix A = B*C (intrinsic)---"/)')
   write(*,'(4g10.2)')A

   A = B
   A = A .MULTFD. A
   write(*,'(/" --- matrix A = B x B---"/)')
   write(*,'(4g10.2)')A

   write(*,'(/" --- the end ---"/)')

return

end subroutine sample91

program main
call sample91
end



This archive was generated by hypermail 2b29 : Wed Feb 20 2002 - 03:20:05 EST