v0.8.0 (February, 2022)

@paxtonfitzpatrick paxtonfitzpatrick released this 12 Feb 03:29
· 16 commits to master since this release
564c1d4

Updates to .geo file format

Hypertools now saves DataGeometry objects using the pickle file format internally, rather than HDF5. Thanks to improvements made to the built-in pickle module since Hypertools's initial release, this generally produces smaller files that save and load more quickly. It also lets us drop the dependency on deepdish, which has compatibility issues with various pandas objects, doesn't offer pre-built wheels for more recent Python versions, and is largely unmaintained.

If you need to load .geo files from the old format, hypertools.load now accepts a keyword-only argument, legacy. Install deepdish if necessary, and pass legacy=True to load older DataGeometry objects. You can then .save() them to convert them to the new format.
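The conversion described above could be scripted roughly as follows. This is a minimal sketch, not part of Hypertools: the helper names and the "_converted" suffix are hypothetical, and it assumes hypertools.load accepts a path to a local .geo file and that .save() takes the output filename as a string.

```python
from pathlib import Path


def converted_path(old_path):
    """Return a sibling path for the converted file (hypothetical naming scheme)."""
    old_path = Path(old_path)
    return old_path.with_name(old_path.stem + "_converted.geo")


def convert_legacy_geo(old_path):
    """Load an old-format .geo file and re-save it in the new format."""
    # Imported here so converted_path() works without hypertools installed.
    import hypertools as hyp

    # legacy is keyword-only; loading old files requires deepdish.
    geo = hyp.load(str(old_path), legacy=True)
    new_path = converted_path(old_path)
    # Saving from v0.8.0 writes the new pickle-based format.
    geo.save(str(new_path))
    return new_path
```

Writing to a new filename rather than overwriting keeps the original file available until the converted copy has been checked.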

Improvements to example datasets

All example data files have been upgraded to the new file format. Additionally, the three pre-trained scikit-learn Pipelines Hypertools provides (wiki_model, nips_model, and sotus_model) have been recreated from scratch using a newer scikit-learn version, better text preprocessing, and updated CountVectorizer and LatentDirichletAllocation parameters that result in overall better models.

The example DataGeometry objects associated with these three models (wiki, nips, and sotus) have been updated accordingly, and additionally now use IncrementalPCA as their default reducers, resulting in faster, deterministic transform outputs.

To use the new models and datasets, upgrade Hypertools to v0.8.0 (pip install -U hypertools) and remove the local cache of old versions ([[ -d ~/hypertools_data ]] && rm ~/hypertools_data/*). Older versions of Hypertools will continue to use the old example data.
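For platforms where the shell one-liner above is not available, the same cleanup can be approximated in Python. This is a sketch only: the function name is hypothetical, and it mirrors the shell command by deleting regular, non-hidden files while leaving subdirectories and the cache directory itself in place.

```python
from pathlib import Path


def clear_example_cache(cache_dir=None):
    """Delete cached files in cache_dir and return the names removed."""
    # Default location taken from the shell command in the release notes.
    cache_dir = Path(cache_dir) if cache_dir else Path.home() / "hypertools_data"
    if not cache_dir.is_dir():
        return []  # nothing cached yet
    removed = []
    for entry in cache_dir.iterdir():
        # Like the shell glob `*`, skip hidden files; like plain `rm`, skip directories.
        if entry.name.startswith(".") or entry.is_dir():
            continue
        entry.unlink()
        removed.append(entry.name)
    return sorted(removed)
```

Calling clear_example_cache() with no argument targets ~/hypertools_data; passing a path is useful for trying it on a scratch directory first.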

Other improvements

  • Hypertools is now compatible with Python 3.9! This release is also compatible in principle with Python 3.10, but numba does not yet support Python 3.10, so certain dependencies will fail to install.
  • Hypertools now works with newer scikit-learn versions! The updates above to the example datasets make Hypertools fully compatible with recent scikit-learn releases (>=0.24). This should make Hypertools easier to use in Colaboratory notebooks and more flexible in general. If you need to use an older scikit-learn version, install an earlier Hypertools release with pip install "hypertools<0.8.0" (the quotes keep the shell from treating < as a redirect).
  • Hypertools now works with newer Matplotlib versions! Recent updates to matplotlib's plotting backends were causing Hypertools's plotting interface to fail on import. We've fixed these bugs and maintained backwards compatibility with older (deprecated) interactive plotting backends as well.
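To check whether an environment meets the scikit-learn minimum mentioned above, a version string can be compared against 0.24. The helpers below are a hypothetical sketch using only the standard library; they read the leading numeric components and ignore pre-release suffixes, so they are not a full version parser.

```python
import re


def version_tuple(version):
    """Turn '1.0.2' into (1, 0, 2), stopping at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum="0.24"):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)
```

In practice the installed string would come from sklearn.__version__, for example meets_minimum(sklearn.__version__).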

Other assorted changes

  • failures when loading example datasets and .geo files now raise HypertoolsIOError with clearer error messages
  • specifying a compression when saving a DataGeometry object now emits a FutureWarning
  • CI tests now run with Python 3.6 through 3.9, use mamba for faster environment setup, and generate more verbose output
  • dependencies and code required for Python 2/3 compatibility have been removed
  • various code causing RuntimeWarnings has been fixed
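One way to find code that still passes a compression argument is to record warnings around the call and look for a FutureWarning. The sketch below uses only the standard warnings module; save_stand_in is a hypothetical placeholder that imitates the behavior described above and is not Hypertools code, and its warning message is invented for illustration.

```python
import warnings


def save_stand_in(fname, compression=None):
    """Hypothetical stand-in for a save call that warns when compression is given."""
    if compression is not None:
        warnings.warn("compression was specified", FutureWarning, stacklevel=2)
    return fname


def emits_future_warning(call):
    """Run call() and report whether it emitted a FutureWarning."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones already shown earlier in the session.
        warnings.simplefilter("always")
        call()
    return any(issubclass(w.category, FutureWarning) for w in caught)
```

The same check can wrap a real save call, for example emits_future_warning(lambda: geo.save("out.geo", compression="zlib")), where geo is an existing DataGeometry object.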