Marc Claesen edited this page Sep 28, 2013 · 13 revisions

SVMEnsemble models a binary ensemble classifier with SVM base models. Models can be aggregated in an ensemble as long as they use an identical kernel function (including kernel parameters). When a model is added to an ensemble, its support vectors are analysed to determine uniqueness.

Ensembles do not contain duplicate support vectors. During ensemble prediction, a support vector that is shared between base models is used only once in kernel evaluations. This makes prediction significantly more efficient than a wrapper-based solution, which would evaluate the kernel separately for every base model that contains a given support vector.
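The saving from shared support vectors can be illustrated with a minimal sketch. It is not the SVMEnsemble implementation: the names BaseModel, rbf_kernel and predict_majority are hypothetical, and the RBF kernel and majority-vote aggregation are assumptions made only for illustration. Each base model refers to support vectors by index into one shared pool, so the kernel is evaluated once per unique support vector.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;

// Hypothetical base model: support vectors are referenced by index
// into a pool shared by the whole ensemble.
struct BaseModel {
    std::vector<std::size_t> sv_index;  // indices into the shared pool
    std::vector<double> alpha;          // signed weight per referenced SV
    double bias;
};

// RBF kernel, chosen here as an assumption; any kernel shared by all
// base models would do.
double rbf_kernel(const Vec& a, const Vec& b, double gamma) {
    assert(a.size() == b.size());
    double sq = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sq += d * d;
    }
    return std::exp(-gamma * sq);
}

// Majority vote over base models (assumed aggregation scheme).
// Returns +1 or -1 and reports how many kernel evaluations were needed.
int predict_majority(const std::vector<Vec>& pool,
                     const std::vector<BaseModel>& models,
                     const Vec& x, double gamma,
                     std::size_t* kernel_evals) {
    // One kernel evaluation per unique support vector.
    std::vector<double> k(pool.size());
    for (std::size_t i = 0; i < pool.size(); ++i)
        k[i] = rbf_kernel(pool[i], x, gamma);
    if (kernel_evals) *kernel_evals = pool.size();

    int votes = 0;
    for (std::size_t m = 0; m < models.size(); ++m) {
        double f = models[m].bias;
        for (std::size_t j = 0; j < models[m].sv_index.size(); ++j)
            f += models[m].alpha[j] * k[models[m].sv_index[j]];  // cached value
        votes += (f >= 0.0) ? 1 : -1;
    }
    return votes >= 0 ? 1 : -1;
}
```

With two base models that reference three support vectors in total but share one of them, this sketch performs two kernel evaluations per prediction, where independently wrapped models would perform three.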

The actual implementation of SVMEnsemble is hidden using the pointer-to-implementation idiom, combined with type-erased iterators (see below). This allows us to change the internal representation of ensembles when appropriate without requiring changes to, or recompilation of, user code. Updated versions of the SVMEnsemble class can be introduced trivially.
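For readers unfamiliar with the pointer-to-implementation idiom, the following is a minimal sketch of the general technique. The class Ensemble and its members add_model and size are hypothetical and do not reflect the real SVMEnsemble interface; the two halves are shown in one listing, but would normally live in a header and a source file respectively.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// ---- what would go in the public header ----
class Ensemble {  // hypothetical class, for illustration only
public:
    Ensemble();
    ~Ensemble();  // declared here, defined where Impl is complete
    Ensemble(const Ensemble&) = delete;             // non-copyable in this sketch
    Ensemble& operator=(const Ensemble&) = delete;

    void add_model(double weight);
    std::size_t size() const;

private:
    struct Impl;                  // only forward-declared for users
    std::unique_ptr<Impl> impl_;  // users see just a pointer
};

// ---- what would go in the source file ----
struct Ensemble::Impl {
    // The internal representation can change freely without
    // touching the header that user code includes.
    std::vector<double> weights;
};

Ensemble::Ensemble() : impl_(new Impl()) {}
Ensemble::~Ensemble() {}

void Ensemble::add_model(double weight) { impl_->weights.push_back(weight); }
std::size_t Ensemble::size() const { return impl_->weights.size(); }
```

Because user code only ever sees the forward declaration of Impl, replacing std::vector<double> with another container changes the source file alone, so code compiled against the header does not need to be rebuilt.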

A note on any_iterator

In order to truly decouple the implementation of SVMEnsemble from its interface, we had to perform type erasure on the iterators of internal data structures. To perform iterator type erasure, we use the code of Type Erasure for C++ Iterators by Thomas Becker, which in turn relies on parts of Boost.

These iterators generally behave like typical STL iterators. There is one crucial limitation to be aware of: type-erased iterators that wrap iterators of different types are not interoperable, even if they point into the same sequence. More information can be found in Thomas Becker's documentation, from which the following example illustrating the main issue is taken:

std::vector<int> int_vector;
std::vector<int>::iterator it = int_vector.begin();
std::vector<int>::const_iterator cit = int_vector.begin();

The iterators it and cit are interoperable, and operations like it==cit are valid. When performing type erasure on it and cit, the resulting iterators are not interoperable, because the original iterators it and cit are of distinct types.

typedef any_iterator<int, boost::random_access_traversal_tag, int const &>
  random_access_const_iterator_to_int;

random_access_const_iterator_to_int ait_1 = it;
random_access_const_iterator_to_int ait_2 = cit;

After these assignments, the comparison

ait_1 == ait_2; // bad comparison!

is invalid, because ait_1 and ait_2 wrap iterators of different types.
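To make the limitation concrete without depending on any_iterator itself, here is a simplified stand-in. ErasedIter and wraps_same_type are hypothetical names, and this is not Thomas Becker's implementation: it only mimics the relevant property, namely that a type-erased wrapper can compare two wrapped iterators only when both have the same underlying type. The precondition check via assert is a choice made for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>
#include <vector>

// Hypothetical, simplified type-erased iterator handle. It supports
// only equality comparison, which is enough to show the limitation.
class ErasedIter {
    struct Concept {
        virtual ~Concept() {}
        virtual const std::type_info& type() const = 0;
        virtual bool equals(const Concept& other) const = 0;
    };

    template <typename It>
    struct Model : Concept {
        explicit Model(It i) : it(i) {}
        const std::type_info& type() const { return typeid(It); }
        bool equals(const Concept& other) const {
            // Valid only if 'other' wraps the same iterator type It.
            return it == static_cast<const Model<It>&>(other).it;
        }
        It it;
    };

    std::shared_ptr<const Concept> self_;

public:
    template <typename It>
    ErasedIter(It it) : self_(std::make_shared<Model<It> >(it)) {}

    // True if both handles wrap the same underlying iterator type.
    bool wraps_same_type(const ErasedIter& other) const {
        return self_->type() == other.self_->type();
    }

    bool operator==(const ErasedIter& other) const {
        assert(wraps_same_type(other));  // mixed types cannot be compared
        return self_->equals(*other.self_);
    }
};
```

In this sketch, one way to stay within the limitation is to convert to a single iterator type before erasing, for example by first assigning it to a std::vector<int>::const_iterator, so that both wrapped iterators have the same type and the comparison is well defined.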