
Constructs hierarchical Voronoi tessellations for a given data set and overlays heatmaps for variables at various levels of the tessellations for in-depth data analysis. Credits to Mu Sigma for their continuous support throughout the development of the package.


muHVT: Collection of functions used to build hierarchical topology preserving maps

Zubin Dowlaty, Shubhra Prakash, Sangeet Moy Das, Shantanu Vaidya, Praditi Shah, Srinivasan Sudarsanam, Somya Shambhawi

2023-06-07

1 Abstract

The muHVT package is a collection of R functions that facilitate building topology preserving maps for rich multivariate data analysis; see Figure 1 for an example of a 2D torus map generated with the package. It is aimed at big data, in particular data sets with a large number of rows. The functions are organized around the following typical workflow:

  1. Data Compression: Vector quantization (VQ) or hierarchical vector quantization (HVQ) using means or medians. This step compresses the rows (long data frame) using a compression objective.

  2. Data Projection: Dimension projection of the compressed cells to 1D, 2D or 3D with Sammon's non-linear mapping algorithm. This step computes the coordinates of the topology preserving map (also called an embedding) in the desired output dimension.

  3. Tessellation: Create the cells required for object visualization using the Voronoi tessellation method. The package includes heatmap plots for hierarchical Voronoi tessellations (HVT). This step enables data insights, visualization, and interaction with the topology preserving map, which is useful for semi-supervised tasks.

  4. Prediction: Scoring new data sets and recording their assignment using the map objects from the above steps, in a sequence of maps if required.
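
The four steps above can be sketched in a few lines of base R. This is a minimal illustration only, not the muHVT API: it assumes flat k-means (`stats::kmeans`) as the vector quantizer, `MASS::sammon` for the projection, and a hypothetical helper `assign_cell()` for scoring. The toy data and the choice of 15 cells are made up, and the tessellation step is left as a comment.

```r
# Sketch only: base R stand-ins for the workflow, not the muHVT functions.
library(MASS)  # provides sammon()

set.seed(42)
# Hypothetical toy data: 500 rows, 3 numeric variables
toy <- data.frame(x = rnorm(500), y = rnorm(500), z = rnorm(500))

# 1. Compression: flat k-means used as a simple vector quantizer (15 cells)
vq <- kmeans(toy, centers = 15, nstart = 5, iter.max = 50)
centroids <- vq$centers                      # 15 x 3 matrix of cell centroids

# 2. Projection: Sammon's non-linear mapping of the centroids to 2D
proj <- sammon(dist(centroids), k = 2, trace = FALSE)
coords <- proj$points                        # 15 x 2 matrix of map coordinates

# 3. Tessellation: Voronoi cells would be drawn around `coords`
#    (for example with a package such as deldir); omitted here.

# 4. Prediction: assign each new row to its nearest centroid
#    (assign_cell is a hypothetical helper written for this sketch)
assign_cell <- function(new_rows, centers) {
  apply(as.matrix(new_rows), 1, function(r) {
    which.min(colSums((t(centers) - r)^2))   # squared Euclidean distance
  })
}
new_data <- data.frame(x = c(0, 2), y = c(0, -1), z = c(0, 1))
cells <- assign_cell(new_data, centroids)
```

Here `coords` holds one 2D point per compressed cell and `cells` holds the index of the cell each new row falls into. The package's own functions add the hierarchical layers, quantization error thresholds and heatmap plotting that this sketch leaves out.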

The muHVT package allows creation of visually rich tessellations that showcase the power of topology preserving maps. The image below shows a tessellation of a torus; see the vignette for more details.

Figure 1: The Voronoi tessellation for layer 1 with 900 cells, with the heat map overlaid for variable z.

2 Version History

2.1 muHVT (v23.06.07) | What’s New?

07th June, 2023

In this version of muHVT package, the following new features have been introduced:

This package provides functionality to predict cells with layers based on a sequence of maps using predictLayerHVT.

2.2 muHVT (v22.12.06)

06th December, 2022

This package provides functionality to predict based on a sequence of maps.

The creation of a predictive set of maps involves three steps -

  1. Compress: Compress the dataset using a percentage compression rate and a quantization threshold using the HVT() function (Map A).
  2. Remove novelty cells: Manually identify and remove the novelty cells from the dataset using the removeNovelty() function (Map B).
  3. Compress the dataset without novelty: Again, compress the dataset without novelty using n_cells, depth and a quantization threshold using the HVT() function (Map C).

Let us try to understand the steps with the help of the diagram below -

Figure 2: Flow diagram for predicting based on a sequence of maps using predictLayerHVT()
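
The idea behind the three maps can also be illustrated without the package. The sketch below is a rough base R analogy and does not show how HVT(), removeNovelty() or predictLayerHVT() are implemented or called: it assumes k-means as the compressor, picks the novelty cells by hand from rows known to be outliers, and uses a hypothetical helper `nearest()` for scoring. The data and cell counts are invented.

```r
# Sketch only: base R analogy for Map A, Map B and Map C, not the muHVT API.
set.seed(1)
# Hypothetical data: a dense cloud plus five far-away rows (rows 201 to 205)
bulk     <- matrix(rnorm(400), ncol = 2)
outliers <- matrix(rnorm(10, mean = 12, sd = 0.2), ncol = 2)
dat <- rbind(bulk, outliers)

# Map A: compress the full data set
map_a <- kmeans(dat, centers = 10, nstart = 5, iter.max = 50)

# Map B: "manually" mark the cells that hold the known outlier rows as novelty
novelty_cells <- unique(map_a$cluster[201:205])
in_novelty    <- map_a$cluster %in% novelty_cells
dat_novelty   <- dat[in_novelty, , drop = FALSE]
dat_clean     <- dat[!in_novelty, , drop = FALSE]

# Map C: compress again, this time without the novelty rows
map_c <- kmeans(dat_clean, centers = 8, nstart = 5, iter.max = 50)

# Scoring: look up the Map A cell first, then use Map C for non-novel rows
# (nearest is a hypothetical helper written for this sketch)
nearest <- function(rows, centers) {
  apply(rows, 1, function(r) which.min(colSums((t(centers) - r)^2)))
}
new_rows <- rbind(c(0, 0), c(12, 12))
cell_a   <- nearest(new_rows, map_a$centers)
is_novel <- cell_a %in% novelty_cells
cell_c   <- ifelse(is_novel, NA, nearest(new_rows, map_c$centers))
```

Rows that land in a novelty cell of Map A get `NA` in `cell_c` and would be handled by the novelty map instead; all other rows receive a cell from Map C.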

3 Installation of muHVT (v23.06.07)

# Requires the devtools package; if it is missing, run install.packages("devtools") first
library(devtools)
devtools::install_github(repo = "Mu-Sigma/muHVT")

4 Vignettes

The following vignettes are available for the muHVT package:

4.1 muHVT Vignette

muHVT Vignette: Contains descriptions of the functions used for vector quantization and construction of hierarchical Voronoi tessellations for data analysis.

4.2 muHVT Model Diagnostics Vignette

muHVT Model Diagnostics Vignette: Contains descriptions of the functions used to perform model diagnostics and validation for the muHVT model.

4.3 muHVT - Predicting Cells with Layers using predictLayerHVT

muHVT - Predicting Cells with Layers using predictLayerHVT: Contains descriptions of the functions used for predicting cells with layers based on a sequence of maps using predictLayerHVT.
