Proposal: Hierarchical Dimensionality Reduction module #729

dylanrstewart · 2022-07-28T02:52:26Z

Author of Proposal: Dylan Stewart

Reason or Problem

A common issue with multi-dimensional raster image processing (at the extremes, hyperspectral imagery with hundreds of features) is significant redundancy within the feature space. Some datasets have tens or hundreds of bands when only a handful might be necessary for downstream use (e.g., classification, segmentation, clustering).

Proposal

This module takes high-dimensional data and either a desired number of output channels or a threshold, compares the distributions of the features within the data, and returns the most dissimilar grouping of features.

Design:

1. Given a dataset containing $N$ pixels and $F$ features, produce a pairwise-distance matrix
   $$C \in \mathbb{R}^{F \times F},$$
   where $C$ can be computed using various metrics (e.g., Jensen-Shannon divergence, a symmetric Kullback-Leibler divergence, Mahalanobis distance (see Add Mahalanobis Distance Metric #114), Euclidean distance) evaluated over the distribution of pixels within the dataset.
2. Select the most similar pair of features (or spectra) by finding the minimum of $C$ (for a distance/divergence measure) or the maximum (for a similarity measure, e.g., mutual information or cosine similarity), and merge the pair using a specified aggregation (e.g., mean, median, max, min).
3. Update $C$ to reflect the merge and repeat step 2 until the stopping criterion (the desired number of output channels or the threshold) is met. Return the dataset with reduced dimensionality.
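The merge loop described above can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the proposed implementation: the names `pairwise_euclidean` and `hierarchical_reduce` are hypothetical, Euclidean distance stands in for whichever metric is chosen, the stopping criterion is a fixed number of output features, and $C$ is recomputed from scratch after each merge rather than updated incrementally.

```python
import numpy as np


def pairwise_euclidean(X):
    """Return the F x F matrix of Euclidean distances between the columns of X (N x F)."""
    sq = np.sum(X * X, axis=0)                       # squared norm of each feature
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    return np.sqrt(np.maximum(d2, 0.0))              # clip tiny negatives from round-off


def hierarchical_reduce(X, n_out, aggregate=np.mean):
    """Greedily merge the two closest features until n_out features remain.

    X is an (N, F) array of N pixels by F features. Returns the reduced
    (N, n_out) array and, for each output feature, the list of original
    feature indices that were merged into it.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    if not 1 <= n_out <= n_features:
        raise ValueError("n_out must be between 1 and the number of features")

    groups = [[j] for j in range(n_features)]
    cols = [X[:, j] for j in range(n_features)]
    while len(cols) > n_out:
        # Assumption: rebuild C every iteration instead of updating it in place.
        C = pairwise_euclidean(np.stack(cols, axis=1))
        np.fill_diagonal(C, np.inf)                  # ignore self-distances
        i, j = np.unravel_index(np.argmin(C), C.shape)
        i, j = min(i, j), max(i, j)
        members = groups[i] + groups[j]
        # Assumption: aggregate over all original members of the merged group.
        merged = aggregate(X[:, members], axis=1)
        cols[i], groups[i] = merged, members         # merged feature replaces i
        del cols[j], groups[j]                       # feature j is absorbed
    return np.stack(cols, axis=1), groups
```

For a raster stored as `(bands, rows, cols)`, the input could be prepared with `X = raster.reshape(raster.shape[0], -1).T`. Swapping in a divergence-based metric would only require replacing `pairwise_euclidean`; a similarity measure would additionally need `argmax` in place of `argmin`.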

Usage: for reducing the dimensionality of an input by finding correlated features within it and removing the redundancy.

Value: provide support to high-dimensional raster processing applications (e.g., data fusion, hyperspectral, multispectral)

Additional Notes or Context

Some distance metrics already available to build from:

cupy KL divergence function
scipy KL divergence
Other distance/similarity metrics are easy to implement (Euclidean and cosine)
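As a rough illustration of how two of the divergences mentioned above could be evaluated between a pair of bands, the sketch below histograms each band on shared bin edges and compares the histograms with SciPy. The helper names (`band_histogram`, `symmetric_kl`, `js_divergence`) and the smoothing constant `eps` are assumptions made for this example, and the SciPy calls used here (`scipy.special.rel_entr`, `scipy.spatial.distance.jensenshannon`) are not necessarily the functions the links above referred to.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.special import rel_entr


def band_histogram(band, edges, eps=1e-12):
    """Normalised histogram of one band on shared bin edges, smoothed by eps."""
    counts, _ = np.histogram(band, bins=edges)
    p = counts.astype(float) + eps       # avoid empty bins, which make KL infinite
    return p / p.sum()


def symmetric_kl(p, q):
    """Symmetrised KL divergence: KL(p || q) + KL(q || p), in nats."""
    return float(np.sum(rel_entr(p, q)) + np.sum(rel_entr(q, p)))


def js_divergence(p, q):
    """Jensen-Shannon divergence in nats.

    SciPy's jensenshannon returns the JS distance (the square root of the
    divergence), so the result is squared here.
    """
    return float(jensenshannon(p, q) ** 2)
```

Both bands must share the same bin edges for the comparison to be meaningful; the number of bins and their range are tuning choices that this sketch leaves to the caller.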

dylanrstewart added the enhancement (New feature or request) label Jul 28, 2022
