MultipleNormalizationTypes

At the moment, openModeller uses only one type of normalization, based on simple scaling. However, some algorithms can achieve better results with other types of normalization. This page documents and discusses a proposal to let openModeller use different normalization techniques.


1) Create a new interface called "Normalizer", with the following virtual methods:
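The method list is not reproduced on this page. Judging from the ScaleNormalizer code in the later steps, the interface needs at least computeNormalization and normalize. The following is a minimal sketch under that assumption. Sample and SampleSet are simplified stand-ins for the real openModeller Sample and sampler types, IdentityNormalizer is a hypothetical class added only to show the interface in use, and the const qualifier on normalize is an assumption that allows calls through a const Normalizer pointer.

```cpp
#include <vector>

// Simplified stand-ins for openModeller types (assumptions for this sketch):
//   Sample    ~ one vector of environmental values
//   SampleSet ~ the data a sampler would provide
typedef std::vector<double> Sample;
typedef std::vector<Sample> SampleSet;

// Abstract interface: one subclass per normalization technique.
class Normalizer {
public:
  virtual ~Normalizer() {}

  // Derive whatever parameters the technique needs from the reference data.
  virtual void computeNormalization( const SampleSet& samples ) = 0;

  // Normalize a single sample in place using the stored parameters.
  virtual void normalize( Sample * sample ) const = 0;
};

// Hypothetical trivial implementation: leaves every sample unchanged.
class IdentityNormalizer : public Normalizer {
public:
  void computeNormalization( const SampleSet& ) {}
  void normalize( Sample * ) const {}
};
```

The virtual destructor matters here because algorithms will delete the normalizer through a base-class pointer.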

2) Create a class "ScaleNormalizer" that implements the "Normalizer" interface. The constructor will receive min and max (the bounds of the target interval) and useLayerAsReference, which indicates whether the reference minimum and maximum are taken from the layers or from the samples. The class will have the following properties:
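The property list is also missing from this page. The member names in the sketch below (_min, _max, _useLayerAsReference, _scales, _offsets) are taken from the computeNormalization and normalize code in the later steps. Their types are assumptions, std::vector<double> stands in for Sample, and the three accessors are hypothetical, added only so the sketch can be inspected.

```cpp
#include <vector>

typedef std::vector<double> Sample;  // stand-in for openModeller's Sample (assumption)

// Declaration sketch only; in the proposal this class would derive from
// Normalizer and implement computeNormalization and normalize.
class ScaleNormalizer {
public:
  ScaleNormalizer( double min, double max, bool useLayerAsReference )
    : _min( min ), _max( max ),
      _useLayerAsReference( useLayerAsReference ) {}

  // Hypothetical read-only accessors (not part of the proposal).
  double targetMin() const { return _min; }
  double targetMax() const { return _max; }
  bool usesLayerAsReference() const { return _useLayerAsReference; }

private:
  double _min;                // lower bound of the target interval
  double _max;                // upper bound of the target interval
  bool _useLayerAsReference;  // true: min/max from layers; false: from samples
  Sample _scales;             // per-variable scale, set by computeNormalization
  Sample _offsets;            // per-variable offset, set by computeNormalization
};
```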

3) Create a new property "Normalizer * _normalizerPtr" in the AlgorithmImpl class, initialized as a null pointer (0), and remove the properties _norm_offsets, _norm_scales, and _has_norm_params.

4) Algorithms will need to instantiate the desired normalizer in the initialize method, assigning it to the _normalizerPtr property. Algorithms will also be responsible for releasing that memory in their destructor.

5) Implement needNormalization in Algorithm.hh as:

return ( _normalizerPtr == 0 ) ? 0 : 1;

4) Create a new method in Algorithm.hh as:

Normalizer * getNormalizer();

which will return _normalizerPtr
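The ownership rules described so far (a null-initialized _normalizerPtr, creation in initialize, release in the destructor, plus needNormalization and getNormalizer) could fit together roughly as follows. This is a simplified sketch: AlgorithmImpl is reduced to the members discussed here, Normalizer is an empty placeholder, and DummyNormalizer and ExampleAlgorithm are hypothetical names.

```cpp
// Empty placeholder for the proposed interface (assumption for this sketch).
class Normalizer {
public:
  virtual ~Normalizer() {}
};

// Hypothetical concrete normalizer, standing in for ScaleNormalizer.
class DummyNormalizer : public Normalizer {};

// AlgorithmImpl reduced to the members discussed in the proposal.
class AlgorithmImpl {
public:
  AlgorithmImpl() : _normalizerPtr( 0 ) {}
  virtual ~AlgorithmImpl() {}

  // Normalization is needed exactly when a normalizer was instantiated.
  int needNormalization() const { return ( _normalizerPtr == 0 ) ? 0 : 1; }

  Normalizer * getNormalizer() const { return _normalizerPtr; }

protected:
  Normalizer * _normalizerPtr;
};

// Hypothetical algorithm that owns its normalizer.
class ExampleAlgorithm : public AlgorithmImpl {
public:
  int initialize()
  {
    delete _normalizerPtr;                   // safe on a null pointer
    _normalizerPtr = new DummyNormalizer();
    return 1;
  }

  ~ExampleAlgorithm()
  {
    delete _normalizerPtr;                   // the algorithm releases what it created
    _normalizerPtr = 0;
  }
};
```

An algorithm that never assigns _normalizerPtr reports that it needs no normalization, which replaces the old _has_norm_params flag.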

5) In the createModel method of Algorithm.cpp, we have these lines:

computeNormalization( _samp );
setNormalization( _samp );

They should be replaced by:

if ( needNormalization() ) {

  // only reachable if an algorithm overrides needNormalization
  if ( _normalizerPtr == 0 ) {

    throw AlgorithmException( "Algorithm indicated the need for normalization but did not instantiate any normalizer." );
  }

  // the normalizer derives its parameters from the sampler
  _normalizerPtr->computeNormalization( _samp );

  setNormalization( _samp );
}

6) Implement computeNormalization in ScaleNormalizer as:

void
ScaleNormalizer::computeNormalization( const ConstSamplerPtr& samp )
{
  int dim = samp->numIndependent();
  Sample min(dim), max(dim);
  samp->getMinMax( &min, &max, _useLayerAsReference );

  _scales.resize(dim);
  _offsets.resize(dim);

  for (int i = 0; i < dim; ++i) {

    _scales[i] = (_max - _min) / (max[i] - min[i]);
    _offsets[i] = _min - _scales[i] * min[i];
  }
}

Then remove computeNormalization from Algorithm.hh and Algorithm.cpp.

7) Change getMinMax in Sampler.cpp to:

void SamplerImpl::getMinMax( Sample * min, Sample * max, bool useLayerAsReference ) const
{
  if ( useLayerAsReference ) {

    if ( _env ) { 

      _env->getMinMax( min, max );

      return;
    }
    else {

      // no environment object exists, so fall back to min/max from the occurrences
      g_log.warn( "No environment set. Could not get min/max from layers to normalize values.\n");
    }
  }

  // first get all occurrence objects in the same container
  OccurrencesPtr allOccs( new OccurrencesImpl( _presence->name(),
                                               _presence->coordSystem() ) );
  allOccs->appendFrom( _presence );
  allOccs->appendFrom( _absence );

  // now compute normalization parameters
  allOccs->getMinMax( min, max );
}

8) Change the implementations of setNormalization in Algorithm.cpp to:

void
AlgorithmImpl::setNormalization( const SamplerPtr& samp) const
{
  samp->normalize( needNormalization(), _normalizerPtr );
}

void
AlgorithmImpl::setNormalization( const EnvironmentPtr& env) const
{
  env->normalize( needNormalization(), _normalizerPtr );
}

9) Change the "normalize" calls in Sampler.cpp to:

...->normalize( use_normalization, _normalizerPtr );

10) Change method normalize in Environment.cpp to:

void
EnvironmentImpl::normalize( bool use_normalization, const Normalizer * normalizerPtr ) {
  _normalize = use_normalization;
  _normalizerPtr = normalizerPtr;
}

Also remove the _scales and _offsets properties from EnvironmentImpl, since the normalizer now holds them.

11) Change method getNormalized in Environment.cpp to:

Sample
EnvironmentImpl::getNormalized( Coord x, Coord y ) const
{
  Sample sample;

  getUnnormalizedInternal( &sample, x, y );

  if ( _normalize ) {

    _normalizerPtr->normalize( &sample );
  }

  return sample;
}

12) Implement normalize in ScaleNormalizer as:

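void
ScaleNormalizer::normalize( Sample * sample ) const
{
  // const so that it can be called through the const Normalizer pointer
  // held by Environment and Occurrence

  // value * scale + offset per variable; empty samples are left untouched
  if ( sample->size() != 0 ) {
    *sample *= _scales;
    *sample += _offsets;
  }
}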
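Since the point of the interface is to allow other techniques, here is a hedged illustration of what a second implementation might look like. StandardizingNormalizer is a hypothetical class that is not part of this proposal: it rescales each variable to zero mean and unit standard deviation. Sample and SampleSet are simplified stand-ins for the openModeller types, and the Normalizer interface shown is assumed from the method names used on this page.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> Sample;      // stand-in for openModeller's Sample
typedef std::vector<Sample> SampleSet;   // stand-in for the sampler's data

// Assumed shape of the proposed interface.
class Normalizer {
public:
  virtual ~Normalizer() {}
  virtual void computeNormalization( const SampleSet& samples ) = 0;
  virtual void normalize( Sample * sample ) const = 0;
};

// Hypothetical second technique: (value - mean) / standard deviation.
class StandardizingNormalizer : public Normalizer {
public:
  void computeNormalization( const SampleSet& samples )
  {
    const std::size_t n = samples.size();
    const std::size_t dim = ( n > 0 ) ? samples[0].size() : 0;
    _means.assign( dim, 0.0 );
    _stdDevs.assign( dim, 0.0 );

    // Per-variable mean.
    for ( std::size_t s = 0; s < n; ++s )
      for ( std::size_t i = 0; i < dim; ++i )
        _means[i] += samples[s][i];
    for ( std::size_t i = 0; i < dim; ++i )
      _means[i] /= n;

    // Per-variable population standard deviation.
    for ( std::size_t s = 0; s < n; ++s )
      for ( std::size_t i = 0; i < dim; ++i ) {
        const double d = samples[s][i] - _means[i];
        _stdDevs[i] += d * d;
      }
    for ( std::size_t i = 0; i < dim; ++i )
      _stdDevs[i] = std::sqrt( _stdDevs[i] / n );
  }

  void normalize( Sample * sample ) const
  {
    for ( std::size_t i = 0; i < sample->size() && i < _means.size(); ++i ) {
      // Assumption: a constant variable (zero deviation) maps to 0.
      (*sample)[i] = ( _stdDevs[i] > 0.0 )
                       ? ( (*sample)[i] - _means[i] ) / _stdDevs[i]
                       : 0.0;
    }
  }

private:
  Sample _means;
  Sample _stdDevs;
};
```

Environment and Occurrence would not need to change to use such a class, because they only see the Normalizer pointer.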

13) Change normalize method in Occurrence.cpp to:

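void
OccurrenceImpl::normalize( bool use_normalization, const Normalizer * normalizerPtr )
{
  // start from the raw environmental values
  normEnv_ = unnormEnv_;

  // Normalizer::normalize takes a whole Sample, so apply it once
  // instead of once per element
  if ( use_normalization && normalizerPtr ) {

    normalizerPtr->normalize( &normEnv_ );
  }
}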

14) Change Normalizable interface

15) Change get and set configuration methods in Algorithm.cpp

16) Should we have Normalizer::getId()?


2014-08-13 10:37