Blitz++ Stencils (fwd)

From: tveldhui (tveldhui@extreme.indiana.edu)
Date: Fri Dec 18 1998 - 09:59:04 EST


Forwarded message:
>From petern@nada.kth.se Thu Nov 12 07:50:24 1998
Sender: petern@nada.kth.se
Message-ID: <364AD987.E6EC28B6@nada.kth.se>
Date: Thu, 12 Nov 1998 13:50:15 +0100
From: Peter Nordlund <petern@nada.kth.se>
Organization: KTH, NADA, CVAP
X-Mailer: Mozilla 4.06 [en] (X11; I; SunOS 5.5.1 sun4u)
MIME-Version: 1.0
To: tveldhui@oonumerics.org
Subject: Blitz++ Stencils
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Hi Todd,

I sent a mail quite a while ago concerning the
stencil stuff.

I guess you have been busy with other things, but I just wanted
to check out how things are going.

I am particularly interested in the functionality of the stencils,
and boundary conditions,
since I plan to do some image processing.

I have also thought a bit (a very little bit) about the way
the stencils are implemented. As it is now, when you call
the applyStencil function there are some dummy-functions called to find
out the footprint-size of the operator. Wouldn't it be better if this
was made
as a separate call? If you for example have a small array and
call applyStencil in a tight loop there would be quite a lot
of unneccesary calls to the footprint-size functions.

My suggestion is to have a separate function finding out the
size of the operator, returning a range object. This range object is
used as an input argument to the applyStencil function.
I hope this is possible to implement, I haven't checked if it is.

I enclose my old mail below.

Regards,
Peter

--------------------------------------------------------------------
Subject:
           BZDEV: Concerning stencils, and iterators
       Date:
           Wed, 16 Sep 1998 18:29:18 +0200
      From:
           Peter Nordlund <petern@nada.kth.se>
   Reply-To:
           blitz-dev@oonumerics.org
 Organization:
           KTH, NADA, CVAP
         To:
           blitz-dev@oonumerics.org

Concerning stencils, and iterators:

I want to make fast assignments for an array of HSVf objects,
which are very similar to the HSV24 object example
in the manual section 2.10.1.

To do that I could use stencils, or iterators.

Iterators doesn't seem to be fully implemented yet. I would like
to have iterators sweeping through all elements in some order that
is the same for all arrays of the same size and rank.
I have not thought so much about implications from having
ranges not using all elements in the array.
Maybe it is very difficult to design a good iterator, that will handle
all different storage orders and sparse samplings etc. etc.?
(Could also be that the functions are there but I don't know how to use
them.)
The iterator functions that I miss so far are:
1. Some kind of next_element or advance function that will go
to the next element in the array.
2. A reference function providing me with a reference to the current
element.
3. An end iterator so that I would know when to stop looping.
4. An index function returning the index of the current element (a
tinyvec<N_rank>).
 (Could be good for several reasons, e.g. making a list of elements with
some
  particular property.....)
------------
I would also like to be able to call member functions of the
Array-elements
inside a stencil.
Would that be a reasonable request? - I dont know. - Anyway, I think
it would be a good idea, if it isn't please tell me.
A reason for this request could be to gain some excecution speed.
In the code provided below I have tested with a construction that is
possible with blitz-19980903, using a constructor call in the assignment
(the stencil hsv).
That's the most clever thing I could come up with,
maybe someone else knows a better approach?

Then I hacked the iterator code and the stencilExtent object to make it
possible to get a reference from the iterator, and then to make a
subsequent
call to a member function making the assignment.
(the stencil hsvref).

Using the different approaches I get the following timing results for
a 380x380 pixels image on my SGI Octane with egcs-1.1b

 1.1668 seconds for Iterator code 10 loops
0.11668 seconds/loop
 1.7617 seconds for blitz code 10 loops (stencil hsv, using constructor
call)
0.17617 seconds/loop
 1.1304 seconds for blitz code using refs 10 loops (stencil hsvref)
0.11304 seconds/loop

So, I gain quite a lot of speed by not using a constructor call in the
stencil.
Below I provide my own code, and also the altered stencil.h code.
Please comment if my new stencil.h code seems to be a good idea.

Best regards,
Peter
//---------------------------------------------
// hsv.h
#ifndef _castests_hsv_h_
#define _castests_hsv_h_

class HSVf {
public:
  friend ostream& operator<<(ostream& os, const HSVf& hsv);

  explicit HSVf() { }
  template <class T> explicit HSVf(T r, T g, T b) {
    assignFromRGB(r, g, b);
  }
  // Accessors
  const float h() const { return h_; }
  const float s() const { return s_; }
  const float v() const { return v_; }
  // Mutators
  void assignFromRGB(unsigned char r, unsigned char g, unsigned char b)
{
    const double r_g_b = r+g+b;
    const double r_g = r-g;
    const double r_b = r-b;
    // Hue
    h_ = (255/(M_PI))*acos(0.5*(r_g + r_b)/
             max(0.00001, sqrt(r_g*r_g + r_b*(g - b))));
    // Saturation
    s_ = 255*(1 - 3*(min(r, min(g, b))/max(1.0, r_g_b)));
    // Value
    v_ = r_g_b/3;
  }
private:
  // Accessors
  ostream& print(ostream& os) const {
    return os <<"(h:"<<h()<<" s:"<<s()<<" v:"<<v() << ")";
  }
  // Data members
  float h_, s_, v_;
};

ostream& operator<<(ostream& os, const HSVf& hsv) { return
hsv.print(os); }

#endif /* _castests_hsv_h_ */
//---------------------------------------------
// hsv-algo.h
#ifndef _castests_hsv_algo_h_
#define _castests_hsv_algo_h_

#include "hsv.h"
#include <blitz/array.h>

//BZ_USING_NAMESPACE(blitz)
BZ_DECLARE_MULTICOMPONENT_TYPE(HSVf, float, 3);

// True if image contains skin-color
inline bool skin_seg(const HSVf& hsv) {
  return (hsv.h()<25) && (hsv.s()>30) && (hsv.s()<150) &&
    (hsv.v()>60) && (hsv.v()<220);
}

// Assign a HSV image from R,G,B using constructor call
BZ_DECLARE_STENCIL4(hsv, HSV, R, G, B)
  HSV = HSVf(R, G, B);
BZ_END_STENCIL

// Assign a HSV image from R,G,B using member function
BZ_DECLARE_STENCIL4(hsvref, HSV, R, G, B)
  HSV.ref().assignFromRGB(R, G, B);
BZ_END_STENCIL

// Segment skin image
BZ_DECLARE_STENCIL2(skin_segment, res, HSV)
  res = skin_seg(HSV);
BZ_END_STENCIL

#endif /* _castests_hsv_algo_h_ */

//---------------------------------------------
// main.cc
#define FunctionString \
RCSHEADER"\n\nFunction:Testing blitz stuff.\n"

#include <fstream.h>
#include <blitz/timer.h>
#include <blitz/array.h>

BZ_USING_NAMESPACE(blitz)

// Just for reading the images from disk
#include <canblitz/canblitz-io.h>

#include "hsv-algo.h"

static int arg_infile = STDIN;
static int arg_mask = 0;
static int arg_outfile = STDOUT;
int verbose = 0;
static int noLoops = 100;
static int can_flag = 0;
static int tmp_flag = 0;
static float threshold = 0;
optsDescription util_opts[] =
{
    CANDELA_STDOPTS(optsBitInvis),
    {"-help", "Print this help text", optsTypeHelp, (void *) NULL,
FunctionString},
    {"-usage", "Print usage", optsTypeUsage},
    {"-loops", "Number of loops", optsTypeInt, &noLoops},
    {"-verbose", "Verbose printing level", optsTypeInt, &verbose},
    {"-tmp", "Set tmp_flag for testing code", optsTypeFlag, &tmp_flag},
    {"-threshold", "Thresholding", optsTypeFloat, &threshold},
    {"-can", "Ouput candela pic (otherwise blitz pic", optsTypeFlag,
&can_flag},
    {"-infile", "In picture", optsTypeArg, &arg_infile, NULL,
can_conv_open},
    {"-outfile", "Outpic", optsTypeArg, &arg_outfile, NULL,
can_conv_create}
};

typedef float dtype;
typedef HSVf HSV_t;

int main (int argc, char **argv)
{
  try {
    // Parsing the command line
    // Observe that this code will not compile,
    // unless you have my local code, which you don't have!
    can_initialize (&argc, argv, util_opts, sizeof (util_opts));

    Array<dtype,2> R,G,B;

    cerr << "Before opening stream" << endl;
    {
      // Reading images from disk. Stored in local image format
"Candela"
      CanIStream ci(arg_infile);
      ci >> R >> G >> B;
    }
    cerr << "Closed stream" << endl;

    Array<HSV_t,2> HSV(R.shape());

    
    if (verbose > 5) {
      cout << "R = " << R << endl;
      cout << "G = " << G << endl;
      cout << "B = " << B << endl;
    }

    // Testing iterator loop,
    // NOT NICE CODE, but it works for testing purposes right now.
    if (tmp_flag) {
      Timer timer;
      timer.start();

      Array<bool,2> skin_res(R.shape());
      Array<HSV_t,2> HSV_comp(R.shape());
      for (int i = 0; i < noLoops; i++) {
        Array<dtype,2>::T_iterator RIter = R.begin();
        Array<dtype,2>::T_iterator GIter = G.begin();
        Array<dtype,2>::T_iterator BIter = B.begin();
        Array<HSV_t,2>::T_iterator HSVIter = HSV_comp.begin();
        
        for (int i = 0; i<R.size(); ++i){
          HSVIter.ref().assignFromRGB(*RIter, *GIter, *BIter);

          // Increment iterators
          RIter.advanceUnitStride();
          GIter.advanceUnitStride();
          BIter.advanceUnitStride();
          HSVIter.advanceUnitStride();
        }
      }
      timer.stop();
      cout << setw(7) << setprecision(5) << (timer.elapsedSeconds())
           << " seconds for Iterator code " <<
        noLoops << " times" << endl;
      cout << setw(7) << setprecision(5) <<
(timer.elapsedSeconds())/noLoops
           << " seconds/loop" << endl;
      applyStencil(skin_segment(), skin_res, HSV_comp);
      if (verbose > 0) {
        cerr << "skin_res:" << skin_res << endl;
      }
    }

    // Code using Blitz original stencil code
    // It seems like I have to make a constructor call inside the
stencil....
    {
      Timer timer;
      timer.start();

      Array<bool,2> skin_res(R.shape());
      Array<HSV_t,2> HSV_comp(R.shape());
      for (int i = 0; i < noLoops; i++) {
        // Using constructor call
        applyStencil(hsv(), HSV_comp, R, G, B);
      }
      timer.stop();
      cout << setw(7) << setprecision(5) << (timer.elapsedSeconds())
           << " seconds for blitz code " <<
        noLoops << " times" << endl;
      cout << setw(7) << setprecision(5) <<
(timer.elapsedSeconds())/noLoops
           << " seconds/loop" << endl;
      applyStencil(skin_segment(), skin_res, HSV_comp);
      if (verbose > 0) {
        cerr << "skin_res:" << skin_res << endl;
      }
    }

    // Blitz loop using references
    {
      Timer timer;
      timer.start();

      Array<HSV_t,2> HSV_comp(R.shape());
      Array<bool,2> skin_res(R.shape());
      for (int i = 0; i < noLoops; i++) {
        // Using member function call
        applyStencil(hsvref(), HSV_comp, R, G, B);
      }
      timer.stop();
      cout << setw(7) << setprecision(5) << (timer.elapsedSeconds())
           << " seconds for blitz code using refs " <<
        noLoops << " times" << endl;
      cout << setw(7) << setprecision(5) <<
(timer.elapsedSeconds())/noLoops
           << " seconds/loop" << endl;
      applyStencil(skin_segment(), skin_res, HSV_comp);
      if (verbose > 0) {
        cerr << "skin_res:" << skin_res << endl;
        Array<dtype,2> HSV0(R.shape()), HSV1(R.shape()),
HSV2(R.shape());
        HSV0 = HSV_comp[0];
        HSV1 = HSV_comp[1];
        HSV2 = HSV_comp[2];
        cout << "HSV[0] = " << HSV0 << endl;
        cout << "HSV[1] = " << HSV1 << endl;
        cout << "HSV[2] = " << HSV2 << endl;
      }
    }
  }
  catch(const std::exception& ex) { cerr << ex.what() << endl; }
  catch(...) { cerr << "Fatal error"; }

  return 0;
}
//----------------------------------------------------------------
//In iter.h, just add
T_numtype& ref() { return const_cast<T_numtype&>(*data_); }
// inside the ArrayIterator class
//----------------------------------------------------------------
// In stencil.h the class stencilExtent has been changed
// to the following:

template<int N_rank, class P_numtype>
class stencilExtent {
public:
    typedef P_numtype T_numtype;

    stencilExtent()
    {
        min_ = 0;
        max_ = 0;
    }
#if 0 // This stuff has been replaced by my code
    dummy<T_numtype> operator()(int i)
    {
        update(0, i);
        return dummy<T_numtype>(1);
    }
 
    dummy<T_numtype> operator()(int i, int j)
    {
        update(0, i);
        update(1, j);
        return dummy<T_numtype>(1);
    }

    dummy<T_numtype> operator()(int i, int j, int k)
    {
        update(0, i);
        update(1, j);
        update(2, k);
        return dummy<T_numtype>(1);
    }

    dummy<T_numtype> shift(int offset, int dim)
    {
        update(dim, offset);
        return dummy<T_numtype>(1);
    }
  
    dummy<T_numtype> shift(int offset1, int dim1, int offset2, int dim2)
    {
        update(dim1, offset1);
        update(dim2, offset2);
        return dummy<T_numtype>(1);
    }
 
    dummy<_bz_typename multicomponent_traits<T_numtype>::T_element>
        operator[](int)
    {
        return dummy<_bz_typename
multicomponent_traits<T_numtype>::T_element>
            (1);
    }
#endif //petern

    // Here comes my new code....
    T_numtype& ref()
    {
        for (int i=0; i<N_rank; ++i)
          update(i, 0);
        return t_dummy_;
    }

    T_numtype& operator()(int i)
    {
        update(0, i);
        return t_dummy_;
    }
 
    T_numtype& operator()(int i, int j)
    {
        update(0, i);
        update(1, j);
        return t_dummy_;
    }

    T_numtype& operator()(int i, int j, int k)
    {
        update(0, i);
        update(1, j);
        update(2, k);
        return t_dummy_;
    }

    T_numtype shift(int offset, int dim)
    {
        update(dim, offset);
        return t_dummy_;
    }
  
    T_numtype shift(int offset1, int dim1, int offset2, int dim2)
    {
        update(dim1, offset1);
        update(dim2, offset2);
        return t_dummy_;
    }
 
    _bz_typename multicomponent_traits<T_numtype>::T_element
        operator[](int)
    {
        return _bz_typename
multicomponent_traits<T_numtype>::T_element();
    }
   // Here ends my new code... The rest is as before, except for
   // the new data member

    void update(int rank, int offset)
    {
        if (offset < min_[rank])
            min_[rank] = offset;
        if (offset > max_[rank])
            max_[rank] = offset;
    }

    template<class T_numtype2>
    void combine(const stencilExtent<N_rank,T_numtype2>& x)
    {
        for (int i=0; i < N_rank; ++i)
        {
            min_[i] = ::min(min_[i], x.min(i));
            max_[i] = ::max(max_[i], x.max(i));
        }
    }

    template<class T_numtype2>
    void combine(const dummy<T_numtype2>&)
    { }

    int min(int i) const
    { return min_[i]; }

    int max(int i) const
    { return max_[i]; }

    const TinyVector<int,N_rank>& min() const
    { return min_; }

    const TinyVector<int,N_rank>& max() const
    { return max_; }

    template<class T>
    void operator=(T)
    { }

    // NEEDS_WORK: other operators
    template<class T> void operator+=(T) { }
    template<class T> void operator-=(T) { }
    template<class T> void operator*=(T) { }
    template<class T> void operator/=(T) { }

    operator T_numtype()
// Minor change, don't know if it is good or not
//{ return T_numtype(1); }
  { return T_numtype(); }
    T_numtype operator*()
// { return T_numtype(1); }
    { return T_numtype(); }
 
private:
    _bz_mutable TinyVector<int,N_rank> min_, max_;
    // My new data member
    _bz_mutable T_numtype t_dummy_;
};



This archive was generated by hypermail 2b29 : Wed Feb 20 2002 - 04:30:08 EST