v0.7.0
Version 0.7.0
We're happy to announce the AutoGluon 0.7 release. This release contains a new experimental module autogluon.eda
for exploratory
data analysis. AutoGluon 0.7 offers conda-forge support, enhancements to Tabular, MultiModal, and Time Series
modules, and many quality of life improvements and fixes.
As always, only load previously trained models using the same version of AutoGluon that they were originally trained on.
Loading models trained in different versions of AutoGluon is not supported.
This release contains 170 commits from 19 contributors!
See the full commit change-log here: v0.6.2...v0.7.0
Special thanks to @MountPOTATO who is a first time contributor to AutoGluon this release!
Full Contributor List (ordered by # of commits):
@Innixma, @zhiqiangdon, @yinweisu, @gradientsky, @shchur, @sxjscience, @FANGAreNotGnu, @yongxinw, @cheungdaven,
@liangfu, @tonyhoo, @bryanyzhu, @suzhoum, @canerturkmen, @giswqs, @gidler, @yzhliu, @Linuxdex and @MountPOTATO
AutoGluon 0.7 supports Python versions 3.8, 3.9, and 3.10. Python 3.7 is no longer supported as of this release.
Changes
NEW: AutoGluon available on conda-forge
As of AutoGluon 0.7 release, AutoGluon is now available on conda-forge (#612)!
Kudos to the following individuals for making this happen:
- @giswqs for leading the entire effort and being a 1-man army driving this forward.
- @h-vetinari for providing excellent advice for working with conda-forge and some truly exceptional feedback.
- @arturdaraujo, @PertuyF, @ngam and @priyanga24 for their encouragement, suggestions, and feedback.
- The conda-forge team for their prompt and effective reviews of our (many) PRs.
- @gradientsky for testing M1 support during the early stages.
- @sxjscience, @zhiqiangdon, @canerturkmen, @shchur, and @Innixma for helping upgrade our downstream dependency versions to be compatible with conda.
- Everyone else who has supported this process either directly or indirectly.
NEW: autogluon.eda
(Exploratory Data Analysis)
We are happy to announce AutoGluon Exploratory Data Analysis (EDA) toolkit. Starting with v0.7, AutoGluon now can analyze and visualize different aspects of data and models. We invite you to explore the following tutorials: Quick Fit, Dataset Overview, Target Variable Analysis, Covariate Shift Analysis. Other materials can be found in EDA Section of the website.
General
- Added Python 3.10 support. @Innixma (#2721)
- Dropped Python 3.7 support. @Innixma (#2722)
- Removed
dask
anddistributed
dependencies. @Innixma (#2691) - Removed
autogluon.text
andautogluon.vision
modules. We recommend usingautogluon.multimodal
for text and vision tasks going forward.
AutoMM
AutoGluon MultiModal (a.k.a AutoMM) supports three new features: 1) document classification; 2) named entity recognition
for Chinese language; 3) few shot learning with SVM
Meanwhile, we removed autogluon.text
and autogluon.vision
as these features are supported in autogluon.multimodal
New features
- Document Classification
- NER for Chinese Language
- Support Chinese named entity recognition
- See tutorials
- Contributors and commits: @cheungdaven (#2676, #2709)
- Few Shot Learning with SVM
Other Enhancements
- Add new loss function
FocalLoss
. @yongxinw (#2860) - Add matcher realtime inference support. @zhiqiangdon (#2613)
- Add matcher HPO. @zhiqiangdon (#2619)
- Add YOLOX models (small, large, and x-large) and update presets for object detection. @FANGAreNotGnu (#2644, #2867, #2927, #2933)
- Add AutoMM presets @zhiqiangdon. (#2620, #2749, #2839)
- Add model dump for models from HuggingFace, timm and mmdet. @suzhoum @FANGAreNotGnu @liangfu (#2682, #2700, #2737, #2840)
- Bug fix / refactor for NER. @cheungdaven (#2659, #2696, #2759, #2773)
- MultiModalPredictor import time reduction. @sxjscience (#2718)
Bug Fixes / Code and Doc Improvements
- NER example with visualization. @sxjscience (#2698)
- Bug fixes / Code and Doc Improvements. @sxjscience @tonyhoo @giswqs (#2708, #2714, #2739, #2782, #2787, #2857, #2818, #2858, #2859, #2891, #2918, #2940, #2906, #2907)
- Support of Label-Studio file export in AutoMM and added examples. @MountPOTATO (#2615)
- Added example of few-shot memory bank model with feature extraction based on Tip-adapter. @Linuxdex (#2822)
Deprecations
autogluon.vision
namespace is deprecated. @bryanyzhu (#2790, #2819, #2832)autogluon.text
namespace is deprecated. @sxjscience @Innixma (#2695, #2847)
Tabular
- TabularPredictor’s inference speed has been heavily optimized, with an average 250% speedup for real-time inference. This means that TabularPredictor can satisfy <10 ms end-to-end latency on many datasets when using
infer_limit
, and thehigh_quality
preset can satisfy <100 ms end-to-end latency on many datasets by default. - TabularPredictor’s
"multimodal"
hyperparameter preset now leverages the full capabilities of MultiModalPredictor, resulting in stronger performance on datasets containing a mix of tabular, image, and text features.
Performance Improvements
- Upgraded versions of all dependency packages to use the latest releases. @Innixma (#2823, #2829, #2834, #2887, #2915)
- Accelerated ensemble inference speed by 150% by removing TorchThreadManager context switching. @liangfu (#2472)
- Accelerated FastAI neural network inference speed by 100x+ and training speed by 10x on datasets with many features. @Innixma (#2909)
- (From 0.6.1) Avoid unnecessary DataFrame copies to accelerate feature preprocessing by 25%. @liangfu (#2532)
- (From 0.6.1) Refactor
NN_TORCH
model to be dataset iterable, leading to a 100% inference speedup. @liangfu (#2395) - MultiModalPredictor is now used as a member of the ensemble when
TabularPredictor.fit
is passedhyperparameters="multimodal"
. @Innixma (#2890)
API Enhancements
- Added
predict_multi
andpredict_proba_multi
methods toTabularPredictor
to efficiently get predictions from multiple models. @Innixma (#2727) - Allow label column to not be present in
leaderboard
calls when scoring is disabled. @Innixma (#2912)
Deprecations
- Added a deprecation warning when calling
predict_proba
withproblem_type="regression"
. This will raise an exception in a future release. @Innixma (#2684)
Bug Fixes / Doc Improvements
- Fixed incorrect time_limit estimation in
NN_TORCH
model. @Innixma (#2909) - Fixed error when fitting with only text features. @Innixma (#2705)
- Fixed error when
calibrate=True, use_bag_holdout=True
inTabularPredictor.fit
. @Innixma (#2715) - Fixed error when tuning
n_estimators
with RandomForest / ExtraTrees models. @Innixma (#2735) - Fixed missing onnxruntime dependency on Linux/MacOS when installing optional dependency
skl2onnx
. @liangfu (#2923) - Fixed edge-case RandomForest error on Windows. @yinweisu (#2851)
- Added improved logging for
refit_full
. @Innixma (#2913) - Added
compile_models
to the deployment tutorial. @liangfu (#2717) - Various internal code refactoring. @Innixma (#2744, #2887)
- Various doc and logging improvements. @Innixma (#2668)
autogluon.timeseries
New features
TimeSeriesPredictor
now supports past covariates (a.k.a.dynamic features or related time series which is not known for time steps to be predicted). @shchur (#2665, #2680)- New models from StatsForecast got introduced in
TimeSeriesPredictor
for various presets (medium_quality
,high_quality
andbest_quality
). @shchur (#2758) - Support missing value imputation for TimeSeriesDataFrame which allows users to customize filling logics for missing values and fill gaps in an irregular sampled times series. @shchur (#2781)
- Improve quantile forecasting performance of the AutoGluon-Tabular forecaster using the empirical noise distribution. @shchur (#2740)