somasekhar95/SAAK-transform

SAAK-transform

One of the key characteristics of the CNN approach is that features are learned automatically from the data and their labels through backpropagation. These iteratively optimized features are called deep or learned features. Here, we provide a third approach to feature extraction, alongside handcrafted and learned features, based on multi-stage Saak transforms; the resulting features are called Saak features. To obtain Saak features, we still need data labels, but we use them differently: multi-stage Saak coefficients of samples from the same class are computed to build the feature vectors of that class, and we keep only the coefficients with higher discriminant power among object classes in order to reduce the feature dimension.

The Saak (Subspace approximation with augmented kernels) transform consists of three steps: 1) building the optimal linear subspace approximation with orthonormal bases using the second-order statistics of the input vectors, 2) augmenting each transform kernel with its negative, and 3) applying the rectified linear unit (ReLU) to the transform output. The Karhunen-Loève transform (KLT) is used in the first step. Combining Steps 2 and 3 is powerful because together they resolve the sign confusion problem, remove the rectification loss, and allow a straightforward implementation of the inverse Saak transform. Multiple Saak transforms are cascaded to transform images of a larger size. All Saak transform kernels are derived from the second-order statistics of the input random vectors in a one-pass feedforward manner. Neither data labels nor backpropagation is used in kernel determination; labels enter only when selecting the discriminant coefficients.

I've implemented this process of Saak coefficient generation in Python, in an Anaconda environment with the required packages installed (torch, sklearn, etc.). The Saak coefficients are generated in 5 stages, with the feature set reduced in each stage. For each input image, we select non-overlapping blocks of spatial size 2x2.
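As a minimal sketch of Steps 2 and 3 (kernel augmentation followed by ReLU), the snippet below shows why the pair is invertible. It is an illustration under stated assumptions, not the code in this repository: the function names `saak_forward` and `saak_inverse` are hypothetical, and a fixed 4x4 Hadamard basis stands in for the KLT kernels that would normally be learned from data.

```python
import numpy as np

def saak_forward(x, kernels):
    """Project x onto each kernel and its negative, then apply ReLU.

    x:       (n_samples, dim) input vectors
    kernels: (n_kernels, dim) orthonormal rows (stand-in for KLT bases)
    returns: (n_samples, 2 * n_kernels) non-negative coefficients
    """
    augmented = np.vstack([kernels, -kernels])   # Step 2: add the negated kernels
    return np.maximum(x @ augmented.T, 0.0)      # Step 3: ReLU

def saak_inverse(coeffs, kernels):
    """Recover x from the augmented, rectified coefficients."""
    n = kernels.shape[0]
    signed = coeffs[:, :n] - coeffs[:, n:]       # positive part minus negative part
    return signed @ kernels                      # complete orthonormal basis: inverse is the transpose

# Assumption: a scaled Hadamard basis used as hypothetical orthonormal kernels
# for a flattened 2x2 block. Real Saak kernels come from the KLT of the data.
kernels = 0.5 * np.array([
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
], dtype=float)

x = np.array([[3.0, -1.0, 2.0, 0.5]])
coeffs = saak_forward(x, kernels)
recovered = saak_inverse(coeffs, kernels)
```

Each projection survives in exactly one of the two halves of `coeffs`: a positive response stays in the slot of the original kernel, and a negative response moves to the slot of the negated kernel. The sign is therefore encoded by position, nothing is lost to the ReLU, and `recovered` matches `x` up to floating-point error.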
Within each stage, we first calculate the variance of each block and remove the blocks with small variance. Next, we subtract from each remaining block its own mean. We then apply Principal Component Analysis (PCA) to the zero-mean block data to obtain the PCA transform matrix, and transform all the input data patches while keeping only the important spectral components of that matrix. This representative set of samples determines the KLT basis functions. The DC coefficient is the same as that in the RECOS transform, such that a uniform harmonic mean distance is maintained from the circumference of the unit circle. The remaining AC coefficients are obtained by augmentation. Finally, we project the samples onto the augmented kernel set and apply ReLU.
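The per-stage steps above can be sketched roughly as follows. This is a hedged illustration, not the repository's implementation: the helper names (`extract_blocks`, `fit_ac_kernels`, `saak_stage`), the variance threshold, and the number of retained components are all assumptions, the input is synthetic random data, and the DC coefficient is left out so that only the AC kernels are shown.

```python
import numpy as np

def extract_blocks(images, size=2):
    """Split (n, H, W) images into flattened non-overlapping size x size blocks."""
    n, h, w = images.shape
    blocks = images.reshape(n, h // size, size, w // size, size)
    blocks = blocks.transpose(0, 1, 3, 2, 4)           # (n, H/size, W/size, size, size)
    return blocks.reshape(-1, size * size)

def fit_ac_kernels(blocks, var_threshold=1e-3, n_components=3):
    """Estimate AC kernels by PCA on zero-mean, high-variance blocks."""
    keep = blocks.var(axis=1) > var_threshold           # drop the small-variance blocks
    kept = blocks[keep]
    centered = kept - kept.mean(axis=1, keepdims=True)  # subtract each block's own mean
    cov = centered.T @ centered / len(centered)         # second-order statistics
    eigvals, eigvecs = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]    # largest-variance directions first
    return eigvecs[:, order].T                          # (n_components, dim), orthonormal rows

def saak_stage(blocks, kernels):
    """Project zero-mean blocks onto the kernels and their negatives, then apply ReLU."""
    centered = blocks - blocks.mean(axis=1, keepdims=True)
    augmented = np.vstack([kernels, -kernels])
    return np.maximum(centered @ augmented.T, 0.0)

rng = np.random.default_rng(0)
images = rng.random((16, 8, 8))                         # synthetic stand-in for real images
blocks = extract_blocks(images)                         # 16 * 4 * 4 blocks of 4 pixels each
kernels = fit_ac_kernels(blocks)
features = saak_stage(blocks, kernels)                  # one row of 6 coefficients per block
```

In a multi-stage setup, the outputs of one stage would be regrouped spatially and passed through the same procedure again. That cascading, and the feature reduction between stages, is not shown here.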