Version 0.7.0

We're happy to announce the AutoGluon 0.7 release. This release contains a new experimental module autogluon.eda for exploratory
data analysis. AutoGluon 0.7 offers conda-forge support, enhancements to Tabular, MultiModal, and Time Series
modules, and many quality of life improvements and fixes.

As always, only load previously trained models using the same version of AutoGluon that they were originally trained on.
Loading models trained in different versions of AutoGluon is not supported.

This release contains 170 commits from 19 contributors!

See the full commit change-log here: v0.6.2...v0.7.0

Special thanks to @MountPOTATO, who is a first-time contributor to AutoGluon this release!

Full Contributor List (ordered by # of commits):

@Innixma, @zhiqiangdon, @yinweisu, @gradientsky, @shchur, @sxjscience, @FANGAreNotGnu, @yongxinw, @cheungdaven,
@liangfu, @tonyhoo, @bryanyzhu, @suzhoum, @canerturkmen, @giswqs, @gidler, @yzhliu, @Linuxdex and @MountPOTATO

AutoGluon 0.7 supports Python versions 3.8, 3.9, and 3.10. Python 3.7 is no longer supported as of this release.

Changes

NEW: AutoGluon available on conda-forge

As of the AutoGluon 0.7 release, AutoGluon is available on conda-forge (#612)!

Kudos to the following individuals for making this happen:

  • @giswqs for leading the entire effort and being a 1-man army driving this forward.
  • @h-vetinari for providing excellent advice for working with conda-forge and some truly exceptional feedback.
  • @arturdaraujo, @PertuyF, @ngam and @priyanga24 for their encouragement, suggestions, and feedback.
  • The conda-forge team for their prompt and effective reviews of our (many) PRs.
  • @gradientsky for testing M1 support during the early stages.
  • @sxjscience, @zhiqiangdon, @canerturkmen, @shchur, and @Innixma for helping upgrade our downstream dependency versions to be compatible with conda.
  • Everyone else who has supported this process either directly or indirectly.

NEW: autogluon.eda (Exploratory Data Analysis)

We are happy to announce the AutoGluon Exploratory Data Analysis (EDA) toolkit. Starting with v0.7, AutoGluon can now analyze and visualize different aspects of data and models. We invite you to explore the following tutorials: Quick Fit, Dataset Overview, Target Variable Analysis, Covariate Shift Analysis. Other materials can be found in the EDA section of the website.
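
For a quick feel of the new module, here is a minimal sketch of the one-line analysis helpers (function names follow the linked tutorials; `train_data` and the "label" column are illustrative placeholders):

```python
# Minimal sketch of the experimental EDA helpers. `train_data` is assumed to be a
# pandas DataFrame and "label" its target column; both are placeholders.
import autogluon.eda.auto as auto

# Summarize dataset structure: column types, missing values, basic statistics.
auto.dataset_overview(train_data=train_data, label="label")

# Inspect the target variable's distribution and its relationship to the features.
auto.target_analysis(train_data=train_data, label="label")

# Fit a quick baseline model and render its leaderboard and feature importance.
auto.quick_fit(train_data, label="label")
```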

General

  • Added Python 3.10 support. @Innixma (#2721)
  • Dropped Python 3.7 support. @Innixma (#2722)
  • Removed dask and distributed dependencies. @Innixma (#2691)
  • Removed autogluon.text and autogluon.vision modules. We recommend using autogluon.multimodal for text and vision tasks going forward.

AutoMM

AutoGluon MultiModal (a.k.a. AutoMM) supports three new features: 1) document classification; 2) named entity recognition for the Chinese language; and 3) few-shot learning with SVM.

Meanwhile, we removed autogluon.text and autogluon.vision, as these features are now supported in autogluon.multimodal.

New features

  • Document Classification
    • Added scanned document classification (experimental).
    • Customers can train models for scanned document classification in a few lines of code (see the sketch after this list)
    • See tutorials
    • Contributors and commits: @cheungdaven (#2765, #2826, #2833, #2928)
  • NER for Chinese Language
  • Few Shot Learning with SVM
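
A hedged sketch of the few-lines document classification workflow mentioned above (the DataFrames and the "doc_path" / "label" column names are illustrative; each row holds a path to a scanned document image plus its class label):

```python
# Hedged sketch: `train_df` / `test_df` and the "doc_path" / "label" columns are
# placeholders for a DataFrame of document image paths and labels.
from autogluon.multimodal import MultiModalPredictor

predictor = MultiModalPredictor(label="label")
predictor.fit(train_data=train_df, time_limit=600)  # small time budget for the example

predictions = predictor.predict(test_df)
probabilities = predictor.predict_proba(test_df)
```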

Other Enhancements

Bug Fixes / Code and Doc Improvements

Deprecations

Tabular

  1. TabularPredictor’s inference speed has been heavily optimized, with an average 250% speedup for real-time inference. This means that TabularPredictor can satisfy <10 ms end-to-end latency on many datasets when using infer_limit, and the high_quality preset can satisfy <100 ms end-to-end latency on many datasets by default (see the sketch after this list).
  2. TabularPredictor’s "multimodal" hyperparameter preset now leverages the full capabilities of MultiModalPredictor, resulting in stronger performance on datasets containing a mix of tabular, image, and text features.
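
For example, the latency target mentioned in item 1 can be requested directly at fit time through `infer_limit`; a minimal sketch follows (the data and the "label" column are placeholders):

```python
# Minimal sketch of latency-constrained training. `train_data` / `test_data` are
# pandas DataFrames and "label" is the target column; all are placeholders.
from autogluon.tabular import TabularPredictor

predictor = TabularPredictor(label="label").fit(
    train_data,
    presets="high_quality",
    infer_limit=0.01,          # target roughly 10 ms of end-to-end inference time per row
    infer_limit_batch_size=1,  # enforce the limit for single-row (online) inference
)
predictions = predictor.predict(test_data)
```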

Performance Improvements

  • Upgraded versions of all dependency packages to use the latest releases. @Innixma (#2823, #2829, #2834, #2887, #2915)
  • Accelerated ensemble inference speed by 150% by removing TorchThreadManager context switching. @liangfu (#2472)
  • Accelerated FastAI neural network inference speed by 100x+ and training speed by 10x on datasets with many features. @Innixma (#2909)
  • (From 0.6.1) Avoid unnecessary DataFrame copies to accelerate feature preprocessing by 25%. @liangfu (#2532)
  • (From 0.6.1) Refactor NN_TORCH model to be dataset iterable, leading to a 100% inference speedup. @liangfu (#2395)
  • MultiModalPredictor is now used as a member of the ensemble when TabularPredictor.fit is passed hyperparameters="multimodal". @Innixma (#2890)

API Enhancements

  • Added predict_multi and predict_proba_multi methods to TabularPredictor to efficiently get predictions from multiple models (see the sketch after this list). @Innixma (#2727)
  • Allow label column to not be present in leaderboard calls when scoring is disabled. @Innixma (#2912)
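
A hedged sketch of the new multi-model prediction calls (`predictor` is a fitted TabularPredictor and `test_data` a pandas DataFrame; both are placeholders):

```python
# Both methods return a dictionary keyed by model name, so predictions from several
# trained models can be collected efficiently in one call.
predictions_per_model = predictor.predict_multi(test_data)
probabilities_per_model = predictor.predict_proba_multi(test_data)

for model_name, preds in predictions_per_model.items():
    print(model_name, preds.head())
```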

Deprecations

  • Added a deprecation warning when calling predict_proba with problem_type="regression". This will raise an exception in a future release. @Innixma (#2684)

Bug Fixes / Doc Improvements

  • Fixed incorrect time_limit estimation in NN_TORCH model. @Innixma (#2909)
  • Fixed error when fitting with only text features. @Innixma (#2705)
  • Fixed error when calibrate=True, use_bag_holdout=True in TabularPredictor.fit. @Innixma (#2715)
  • Fixed error when tuning n_estimators with RandomForest / ExtraTrees models. @Innixma (#2735)
  • Fixed missing onnxruntime dependency on Linux/macOS when installing optional dependency skl2onnx. @liangfu (#2923)
  • Fixed edge-case RandomForest error on Windows. @yinweisu (#2851)
  • Added improved logging for refit_full. @Innixma (#2913)
  • Added compile_models to the deployment tutorial. @liangfu (#2717)
  • Various internal code refactoring. @Innixma (#2744, #2887)
  • Various doc and logging improvements. @Innixma (#2668)

autogluon.timeseries

New features

  • TimeSeriesPredictor now supports past covariates (a.k.a. dynamic features or related time series that are not known for the time steps to be predicted). @shchur (#2665, #2680)
  • New models from StatsForecast have been added to TimeSeriesPredictor and are used in the medium_quality, high_quality, and best_quality presets. @shchur (#2758)
  • Added missing value imputation for TimeSeriesDataFrame, allowing users to customize the fill logic for missing values and to fill gaps in irregularly sampled time series (see the sketch after this list). @shchur (#2781)
  • Improved quantile forecasting performance of the AutoGluon-Tabular forecaster using the empirical noise distribution. @shchur (#2740)
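
A hedged sketch of the imputation workflow (the toy data and the chosen fill method are illustrative; the `fill_missing_values` name is assumed from the TimeSeriesDataFrame API):

```python
# Hedged sketch of missing value imputation on a TimeSeriesDataFrame.
import pandas as pd
from autogluon.timeseries import TimeSeriesDataFrame

df = pd.DataFrame({
    "item_id": ["A"] * 5,
    "timestamp": pd.date_range("2023-01-01", periods=5, freq="D"),
    "target": [1.0, None, 3.0, None, 5.0],  # gaps to impute
})
ts_df = TimeSeriesDataFrame.from_data_frame(df)

# Fill gaps in the target column; "auto" combines forward- and backward-filling.
ts_df_filled = ts_df.fill_missing_values(method="auto")
print(ts_df_filled)
```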

Bug Fixes / Doc Improvements