Functions that filter UMAP graphs using domain knowledge.

vda-lab/lensed_umap

Lensed UMAP

Lensed UMAP provides three methods that apply lens functions to a UMAP model. Lens functions can be used to untangle embeddings along a particular dimension. This dimension may be part of the data or come from another information source. Using lens functions, analysts can adapt their UMAP models to the questions they are investigating, effectively viewing their data from different perspectives.

How to use Lensed UMAP

The lensed UMAP package provides functions that operate on (fitted) UMAP objects.

import numpy as np
import pandas as pd
from umap import UMAP
import lensed_umap as lu
import matplotlib.pyplot as plt

# Load data and extract lens
df = pd.read_csv("./data/five_circles.csv", header=0)
lens = np.log(df.hue)

# Compute initial UMAP model
projector = UMAP(
    repulsion_strength=0.1,  # To avoid tears in projection that
    negative_sample_rate=2,  # are not in the modelled graph!
).fit(df[["x", "y"]])

# Draw initial model
x, y = lu.extract_embedding(projector)
plt.scatter(x, y, 2, lens, cmap="viridis")
plt.axis("off")
plt.show()

Initial UMAP model

# Apply a global lens
lensed = lu.apply_lens(projector, lens, resolution=6)
x, y = lu.extract_embedding(lensed)
plt.scatter(x, y, 2, lens, cmap="viridis")
plt.axis("off")
plt.show()

Lensed model
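To give a rough intuition for what a global lens with a given `resolution` might do to the UMAP graph, the sketch below filters a toy edge list. It is not the package's implementation: it assumes that the lens is split into `resolution` equal-width segments and that only edges between points in the same or adjacent segments are kept. The helper names `segment_ids` and `filter_edges` are hypothetical and exist only for this illustration.

```python
def segment_ids(lens, resolution):
    """Assign each lens value to one of `resolution` equal-width segments (assumed scheme)."""
    lo, hi = min(lens), max(lens)
    # Fall back to a width of 1.0 when all lens values are identical.
    width = (hi - lo) / resolution or 1.0
    # The maximum value would land in segment `resolution`, so clamp it to the last one.
    return [min(int((v - lo) / width), resolution - 1) for v in lens]


def filter_edges(edges, lens, resolution):
    """Keep edges whose endpoints fall in the same or adjacent lens segments."""
    seg = segment_ids(lens, resolution)
    return [(i, j) for i, j in edges if abs(seg[i] - seg[j]) <= 1]


# Toy graph: four points with increasing lens values and one long-range edge (0, 3).
edges = [(0, 1), (0, 3), (1, 2), (2, 3)]
lens = [0.0, 1.0, 2.0, 3.0]
kept = filter_edges(edges, lens, resolution=4)
print(kept)  # [(0, 1), (1, 2), (2, 3)]
```

In this toy case the edge between points 0 and 3 spans three segments and is dropped, while edges between neighbouring lens values survive. Under this assumption, a higher `resolution` gives narrower segments and therefore removes more edges.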

Example Notebooks

A notebook demonstrating how the package works is available at How lensed UMAP Works. The other notebooks demonstrate lenses on several data sets and contain the analyses presented in our paper. The input datasets and the data generated by our notebooks are stored using git lfs, which turns the files in this repository into versioned links to the actual data files. The git lfs documentation explains how to retrieve the actual data files.

Installing

lensed_umap is available on PyPI:

pip install lensed_umap
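After installing, one way to check that the package is visible to the current Python environment is to query its metadata with the standard library. This is a minimal sketch. The helper `installed_version` is hypothetical, and it assumes the distribution is registered under the same name used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string when the package is installed, otherwise None.
print(installed_version("lensed_umap"))
```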

Citing

A scientific paper describing our work is available on arXiv:

@misc{bot2024lens,
  title={Lens functions for exploring UMAP Projections with Domain Knowledge}, 
  author={Daniel M. Bot and Jan Aerts},
  year={2024},
  eprint={2405.09204},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

Licensing

The lensed UMAP package has a 3-Clause BSD license.