Version 1.0.0

Released by @Innixma on 30 Nov (commit caccb0b)

Today is finally the day... AutoGluon 1.0 has arrived!! After over four years of development and 2061 commits from 111 contributors, we are excited to share with you the culmination of our efforts to create and democratize the most powerful, easy-to-use, and feature-rich automated machine learning system in the world.

AutoGluon 1.0 comes with transformative enhancements to predictive quality resulting from the combination of multiple novel ensembling innovations, spotlighted below. Besides performance enhancements, many other improvements have been made that are detailed in the individual module sections.

This release supports Python versions 3.8, 3.9, 3.10, and 3.11. Loading models trained on older versions of AutoGluon is not supported. Please re-train models using AutoGluon 1.0.

This release contains 223 commits from 17 contributors!

Full Contributor List (ordered by # of commits):

@shchur, @zhiqiangdon, @Innixma, @prateekdesai04, @FANGAreNotGnu, @yinweisu, @taoyang1122, @LennartPurucker, @Harry-zzh, @AnirudhDagar, @jaheba, @gradientsky, @melopeo, @ddelange, @tonyhoo, @canerturkmen, @suzhoum

Join the community:
Get the latest updates: Twitter

Spotlight

Tabular Performance Enhancements

AutoGluon 1.0 features major enhancements to predictive quality, establishing a new state-of-the-art in Tabular modeling. To the best of our knowledge, AutoGluon 1.0 marks the largest leap forward in the state-of-the-art for tabular data since the original AutoGluon paper from March 2020. The enhancements come primarily from two features: dynamic stacking, which mitigates stacked overfitting, and a new learned portfolio of model hyperparameters obtained via Zeroshot-HPO with the newly released TabRepo ensemble simulation library. Together, they lead to a 75% win-rate compared to AutoGluon 0.8, along with faster inference speed, lower disk usage, and higher stability.
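Both enhancements are exercised through the standard TabularPredictor API. Below is a minimal sketch: the dynamic_stacking fit argument and the portfolio-backed "best_quality" preset are from this release, while the dataset URL is the Adult income sample used in AutoGluon's tutorials.

    from autogluon.tabular import TabularDataset, TabularPredictor

    train_data = TabularDataset("https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv")
    predictor = TabularPredictor(label="class").fit(
        train_data,
        presets="best_quality",  # backed by the new Zeroshot-HPO learned portfolio
        dynamic_stacking=True,   # detect and mitigate stacked overfitting before the final fit
    )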

AutoML Benchmark Results

OpenML released the official 2023 AutoML Benchmark results on November 16th, 2023. Their results show AutoGluon 0.8 as the state-of-the-art in AutoML systems across a wide variety of tasks: "Overall, in terms of model performance, AutoGluon consistently has the highest average rank in our benchmark." We now showcase that AutoGluon 1.0 achieves far superior results even to AutoGluon 0.8!

Below is a comparison on the OpenML AutoML Benchmark across 1040 tasks. LightGBM, XGBoost, and CatBoost results were obtained via AutoGluon, and the other methods are from the official AutoML Benchmark 2023 results. AutoGluon 1.0 has a 95%+ win-rate against traditional tabular models, including a 99% win-rate vs LightGBM and a 100% win-rate vs XGBoost. Against the other AutoML systems, AutoGluon 1.0 has between an 82% and 94% win-rate. Against all methods, AutoGluon achieves a >10% average loss improvement (for example, going from 90% accuracy to 91% accuracy is a 10% loss improvement). AutoGluon 1.0 achieves first place in 63% of tasks, with lightautoml having the second most at 12% (AutoGluon 0.8 previously took first place 48% of the time). AutoGluon 1.0 even achieves a 7.4% average loss improvement over AutoGluon 0.8!

Method                     | AG Winrate | AG Loss Improvement | Rescaled Loss | Rank | Champion
AutoGluon 1.0 (Best, 4h8c) | -          | -                   | 0.04          | 1.95 | 63%
lightautoml (2023, 4h8c)   | 84%        | 12.0%               | 0.20          | 4.78 | 12%
H2OAutoML (2023, 4h8c)     | 94%        | 10.8%               | 0.17          | 4.98 | 1%
FLAML (2023, 4h8c)         | 86%        | 16.7%               | 0.23          | 5.29 | 5%
MLJAR (2023, 4h8c)         | 82%        | 23.0%               | 0.33          | 5.53 | 6%
autosklearn (2023, 4h8c)   | 91%        | 12.5%               | 0.22          | 6.07 | 4%
GAMA (2023, 4h8c)          | 86%        | 15.4%               | 0.28          | 6.13 | 5%
CatBoost (2023, 4h8c)      | 95%        | 18.2%               | 0.28          | 6.89 | 3%
TPOT (2023, 4h8c)          | 91%        | 23.1%               | 0.40          | 8.15 | 1%
LightGBM (2023, 4h8c)      | 99%        | 23.6%               | 0.40          | 8.95 | 0%
XGBoost (2023, 4h8c)       | 100%       | 24.1%               | 0.43          | 9.50 | 0%
RandomForest (2023, 4h8c)  | 97%        | 25.1%               | 0.53          | 9.78 | 1%

Not only is AutoGluon 1.0 more accurate, it is also more stable, thanks to our new use of Ray subprocesses during low-memory training, which resulted in zero task failures on the AutoML Benchmark.

AutoGluon 1.0 is capable of achieving the fastest inference throughput of any AutoML system while still obtaining state-of-the-art results. By specifying the infer_limit fit argument, users can trade off between accuracy and inference speed to meet their needs.
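As an illustration, a minimal sketch of the accuracy/speed trade-off (the limit values here are illustrative, not recommendations):

    from autogluon.tabular import TabularDataset, TabularPredictor

    train_data = TabularDataset("https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv")
    predictor = TabularPredictor(label="class").fit(
        train_data,
        infer_limit=0.00005,           # target end-to-end inference time, in seconds per row
        infer_limit_batch_size=10000,  # batch size assumed when measuring inference throughput
    )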

As seen in the below plot, AutoGluon 1.0 sets the Pareto Frontier for quality and inference throughput, achieving Pareto Dominance compared to all other AutoML systems. AutoGluon 1.0 High achieves superior performance to AutoGluon 0.8 Best with 8x faster inference and 8x less disk usage!

AutoGluon 1.0 AutoML Benchmark Plot

You can get more details on the results here.

We are excited to see what our users can accomplish with AutoGluon 1.0's enhanced performance.
As always, we will continue to improve AutoGluon in future releases to push the boundaries of AutoML forward for all.

AutoGluon Multimodal (AutoMM) Highlights in One Figure

AutoMM highlights

AutoMM Uniqueness

AutoGluon Multimodal (AutoMM) distinguishes itself from other open-source AutoML toolboxes like AutoSklearn, LightAutoML, H2OAutoML, FLAML, MLJAR, TPOT, and GAMA, which mainly focus on tabular data for classification or regression. AutoMM is designed for fine-tuning foundation models across multiple modalities: image, text, tabular, and document, either individually or combined. It offers extensive capabilities for tasks like classification, regression, object detection, named entity recognition, semantic matching, and image segmentation. In contrast, other AutoML systems generally have limited support for image or text, typically relying on a few pretrained models like EfficientNet or hand-crafted rules like bag-of-words as feature extractors. AutoMM provides a uniquely comprehensive and versatile approach to AutoML, being the only AutoML system with flexible multimodality and support for a wide range of tasks. A comparative table detailing support for various data modalities, tasks, and model types is provided below.

The table's 14 columns, grouped left to right:
  • Data: image, text, tabular, document, any combination
  • Task: classification, regression, object detection, semantic matching, named entity recognition, image segmentation
  • Model: traditional models, deep learning models, foundation models

LightAutoML ✓ ✓ ✓ ✓ ✓ ✓ ✓
H2OAutoML ✓ ✓ ✓ ✓
FLAML ✓ ✓ ✓ ✓ ✓ ✓ ✓
MLJAR ✓ ✓ ✓ ✓
AutoSklearn ✓ ✓ ✓ ✓ ✓
GAMA ✓ ✓ ✓ ✓
TPOT ✓ ✓ ✓ ✓ ✓ ✓
AutoMM ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓

Special Thanks

We would like to conclude this spotlight by thanking Pieter Gijsbers, Sébastien Poirier, Erin LeDell, Joaquin Vanschoren, and the rest of the AutoML Benchmark authors for their key role in providing a shared and extensive benchmark to monitor the progress of the AutoML field. Their support has been invaluable to the AutoGluon project's continued growth.

We would also like to thank Frank Hutter, who continues to be a leader within the AutoML field, for organizing the AutoML Conference in 2022 and 2023 to bring the community together to share ideas and align on a compelling vision.

Finally, we would like to thank Alex Smola and Mu Li for championing open source software at Amazon to make this project possible.

Additional Special Thanks

  • Special thanks to @LennartPurucker for leading development of dynamic stacking
  • Special thanks to @geoalgo for co-authoring TabRepo to enable Zeroshot-HPO
  • Special thanks to @ddelange for helping to add Python 3.11 support
  • Special thanks to @mglowacki100 for providing extensive feedback and suggestions
  • Special thanks to @Harry-zzh for contributing the new semantic segmentation problem type

General

Highlights

Other Enhancements

Dependency Updates

Tabular

Highlights

AutoGluon 1.0 features major enhancements to predictive quality, establishing a new state-of-the-art in Tabular modeling. Refer to the spotlight section above for more details!

New Features

Performance Improvements

Other Enhancements

Bug Fixes / Code and Doc Improvements

AutoMM

AutoGluon Multimodal (AutoMM) is designed to simplify the fine-tuning of foundation models for downstream applications with just three lines of code. It seamlessly integrates with popular model zoos such as HuggingFace Transformers, TIMM, and MMDetection, providing support for a diverse range of data modalities, including image, text, tabular, and document data, whether used individually or in combination.
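In practice, the three lines look like the following sketch, where train_df and test_df are assumed to be pandas DataFrames whose columns may mix image paths, text, and tabular fields, and "label" is a placeholder column name:

    from autogluon.multimodal import MultiModalPredictor

    predictor = MultiModalPredictor(label="label")  # column to predict
    predictor.fit(train_df)                         # modalities are detected automatically
    predictions = predictor.predict(test_df)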

New Features

Performance Improvements

  • Improved default image backbones, achieving a 100% win-rate on the image benchmark. @taoyang1122 (#3738)
  • Replaced MLPs with FT-Transformer as the default tabular backbones, resulting in a 67% win-rate on the text+tabular benchmark. @taoyang1122 (#3732)
  • Using both the improved default image backbones and FT-Transformer achieves a 62% win-rate on the text+tabular+image benchmark. @taoyang1122 (#3732, #3738)

Stability Enhancements

Enhanced Usability

Improved Scalability

  • The introduction of the new learner class design facilitates easier support for new tasks and data modalities within AutoMM, enhancing overall scalability. @zhiqiangdon (#3650, #3685, #3735)

Other Enhancements

Code Improvements

Bug Fixes/Doc Improvements

TimeSeries

Highlights

AutoGluon 1.0 features numerous usability and performance improvements to the TimeSeries module. These include automatic handling of missing data and irregular time series, new forecasting metrics (including custom metric support), advanced time series cross-validation options, and new forecasting models. AutoGluon produces state-of-the-art results in forecast accuracy, achieving a 70%+ win rate compared to other popular forecasting frameworks.

New features

  • Support for custom forecasting metrics @shchur (#3760, #3602)
  • New forecasting metrics WAPE, RMSSE, SQL + improved documentation for metrics @melopeo @shchur (#3747, #3632, #3510, #3490)
  • Improved robustness: TimeSeriesPredictor can now handle data with all pandas frequencies, irregular timestamps, or missing values represented by NaN @shchur (#3563, #3454)
  • New models: intermittent demand forecasting models based on conformal prediction (ADIDA, CrostonClassic, CrostonOptimized, CrostonSBA, IMAPA); WaveNet and NPTS from GluonTS; new baseline models (Average, SeasonalAverage, Zero) @canerturkmen @shchur (#3706, #3742, #3606, #3459)
  • Advanced cross-validation options: avoid retraining the models for each validation window with the refit_every_n_windows argument, or adjust the step size between validation windows with the val_step_size argument to TimeSeriesPredictor.fit (see the sketch below) @shchur (#3704, #3537)
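A sketch of the new cross-validation options (argument values are illustrative; train_data is assumed to be a TimeSeriesDataFrame, which may now contain NaN values and irregular timestamps):

    from autogluon.timeseries import TimeSeriesPredictor

    predictor = TimeSeriesPredictor(prediction_length=24, freq="H")
    predictor.fit(
        train_data,
        num_val_windows=5,        # backtest on 5 validation windows
        refit_every_n_windows=2,  # retrain models only on every 2nd window
        val_step_size=12,         # time steps between consecutive validation windows
    )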

Enhancements

  • Enable Ray Tune for deep-learning forecasting models @canerturkmen (#3705)
  • Support passing multiple evaluation metrics to TimeSeriesPredictor.evaluate (see the evaluation sketch after this list) @shchur (#3646)
  • Static features can now be passed directly to TimeSeriesDataFrame.from_path and TimeSeriesDataFrame.from_data_frame constructors @shchur (#3635)
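For example, a sketch of multi-metric evaluation with a fitted predictor and held-out test_data (the metrics argument name follows this release; returned values use AutoGluon's higher-is-better sign convention, so error metrics appear negated):

    # Compute several of the newly added metrics in a single call
    scores = predictor.evaluate(test_data, metrics=["WQL", "MASE", "WAPE"])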

Performance improvements

  • Much more accurate forecasts at low time limits thanks to new presets and updated logic for splitting the training time across models @shchur (#3749, #3657, #3741)
  • Faster training and prediction + lower memory usage for DirectTabular and RecursiveTabular models (#3740, #3620, #3559)
  • Enable early stopping and improve inference speed for GluonTS models @shchur (#3575)
  • Reduce import time for autogluon.timeseries by moving import statements inside model classes (#3514)

Bug Fixes / Code and Doc Improvements

EDA

The EDA module will be released at a later time, as it requires additional development effort before it is ready for 1.0.
We will make an announcement when EDA is ready for release. For now, please continue to use "autogluon.eda==0.8.2".

Deprecations

General

  • autogluon.core.spaces has been deprecated. Please use autogluon.common.spaces instead @Innixma (#3701)

Tabular

Tabular will log warnings when deprecated methods are used. Deprecated methods are planned to be removed in AutoGluon 1.2 @Innixma (#3701)

  • autogluon.tabular.TabularPredictor
    • predictor.get_model_names() -> predictor.model_names()
    • predictor.get_model_names_persisted() -> predictor.model_names(persisted=True)
    • predictor.compile_models() -> predictor.compile()
    • predictor.persist_models() -> predictor.persist()
    • predictor.unpersist_models() -> predictor.unpersist()
    • predictor.get_model_best() -> predictor.model_best
    • predictor.get_pred_from_proba() -> predictor.predict_from_proba()
    • predictor.get_oof_pred_proba() -> predictor.predict_proba_oof()
    • predictor.get_oof_pred() -> predictor.predict_oof()
    • predictor.get_model_full_dict() -> predictor.model_refit_map()
    • predictor.get_size_disk() -> predictor.disk_usage()
    • predictor.get_size_disk_per_file() -> predictor.disk_usage_per_file()
    • predictor.leaderboard() silent argument deprecated, replaced by display, defaults to False
      • Same for predictor.evaluate() and predictor.evaluate_predictions()
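As an illustration, migrating a snippet from the 0.8 API to the 1.0 API using the mapping above:

    # Before (AutoGluon 0.8, now deprecated):
    names = predictor.get_model_names()
    best = predictor.get_model_best()
    lb = predictor.leaderboard(test_data, silent=False)

    # After (AutoGluon 1.0):
    names = predictor.model_names()
    best = predictor.model_best  # now a property rather than a method
    lb = predictor.leaderboard(test_data, display=True)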

AutoMM

  • Deprecated the FewShotSVMPredictor in favor of the new few_shot_classification problem type @zhiqiangdon (#3699)
  • Deprecated the AutoMMPredictor in favor of MultiModalPredictor @zhiqiangdon (#3650)
  • autogluon.multimodal.MultiModalPredictor

TimeSeries

  • autogluon.timeseries.TimeSeriesPredictor
    • Deprecated argument TimeSeriesPredictor(ignore_time_index: bool). Now, if the data contains irregular timestamps, either convert it to regular frequency with data = data.convert_frequency(freq) or provide frequency when creating the predictor as TimeSeriesPredictor(freq=freq).
    • predictor.evaluate() now returns a dictionary (previously returned a float)
    • predictor.score() -> predictor.evaluate()
    • predictor.get_model_names() -> predictor.model_names()
    • predictor.get_model_best() -> predictor.model_best
    • Metric "mean_wQuantileLoss" has been renamed to "WQL"
    • predictor.leaderboard() silent argument deprecated, replaced by display, defaults to False
    • When setting hyperparameters to a string in predictor.fit(), supported values are now "default", "light" and "very_light"
  • autogluon.timeseries.TimeSeriesDataFrame
    • df.to_regular_index() -> df.convert_frequency()
    • Deprecated method df.get_reindexed_view(). Please see deprecation notes for ignore_time_index under TimeSeriesPredictor above for information on how to deal with irregular timestamps
  • Models
    • All models based on MXNet (DeepARMXNet, MQCNNMXNet, MQRNNMXNet, SimpleFeedForwardMXNet, TemporalFusionTransformerMXNet, TransformerMXNet) have been removed
    • Statistical models from StatsModels (ARIMA, Theta, ETS) have been replaced by their counterparts from StatsForecast (#3513). Note that these models now have different hyperparameter names.
    • DirectTabular is now implemented using the mlforecast backend (same as RecursiveTabular); most hyperparameter names for the model have changed.
  • autogluon.timeseries.TimeSeriesEvaluator has been deprecated. Please use metrics available in autogluon.timeseries.metrics instead.
  • autogluon.timeseries.splitter.MultiWindowSplitter and autogluon.timeseries.splitter.LastWindowSplitter have been deprecated. Please use num_val_windows and val_step_size arguments to TimeSeriesPredictor.fit instead (alternatively, use autogluon.timeseries.splitter.ExpandingWindowSplitter).
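A sketch of the corresponding TimeSeries migration, grounded in the changes listed above (the "H" frequency is illustrative):

    # Before (AutoGluon 0.8, now deprecated):
    predictor = TimeSeriesPredictor(ignore_time_index=True)
    score = predictor.score(test_data)       # returned a float

    # After (AutoGluon 1.0): regularize timestamps, or pass freq to the predictor
    data = data.convert_frequency(freq="H")
    predictor = TimeSeriesPredictor(freq="H")
    scores = predictor.evaluate(test_data)   # now returns a dict of metric values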

Papers

AutoGluon-TimeSeries: AutoML for Probabilistic Time Series Forecasting

We have published a paper on AutoGluon-TimeSeries at the AutoML Conference 2023 (Paper Link, YouTube Video). In the paper, we benchmarked AutoGluon and popular open-source forecasting frameworks (including DeepAR, TFT, AutoARIMA, AutoETS, AutoPyTorch). AutoGluon produces SOTA results in point and probabilistic forecasting, and even achieves a 65% win rate against the best-in-hindsight combination of models.

TabRepo: A Large Scale Repository of Tabular Model Evaluations and its AutoML Applications

We have published a paper on Tabular Zeroshot-HPO ensembling simulation to arXiv (Paper Link, GitHub). This paper is key to achieving the performance improvements seen in AutoGluon 1.0, and we plan to continue to develop the code-base to support future enhancements.

XTab: Cross-table Pretraining for Tabular Transformers

We have published a paper on tabular Transformer pre-training at ICML 2023 (Paper Link, GitHub). In the paper, we demonstrate state-of-the-art performance for tabular deep learning models, including the ability to match the performance of XGBoost and LightGBM models. While the pre-trained transformer is not yet incorporated into AutoGluon, we plan to integrate it in a future release.

Learning Multimodal Data Augmentation in Feature Space

Our paper on learning multimodal data augmentation was accepted at ICLR 2023 (Paper Link, GitHub). This paper introduces a plug-and-play module to learn multimodal data augmentation in feature space, with no constraints on the identities of the modalities or the relationship between modalities. We show that it can (1) improve the performance of multimodal deep learning architectures, (2) apply to combinations of modalities that have not been previously considered, and (3) achieve state-of-the-art results on a wide range of applications composed of image, text, and tabular data. This work is not yet incorporated into AutoGluon, but we plan to integrate it in a future release.

Data Augmentation for Object Detection via Controllable Diffusion Models

Our paper on generative object detection data augmentation has been accepted at WACV 2024 (paper and GitHub links will be available soon). This paper proposes a data augmentation pipeline based on controllable diffusion models and CLIP, using visual prior generation to guide the generation process and post-filtering with category-calibrated CLIP scores to control quality. We demonstrate that performance improves across various tasks and settings when our augmentation pipeline is used with different detectors. Although diffusion models are not currently integrated into AutoGluon, we plan to incorporate these data augmentation techniques in a future release.

Adapting Image Foundation Models for Video Understanding

We have published a paper on how to efficiently adapt image foundation models for video understanding at ICLR 2023 (Paper Link, GitHub). This paper introduces spatial adaptation, temporal adaptation, and joint adaptation to gradually equip a frozen image model with spatiotemporal reasoning capability. The proposed method achieves competitive or even better performance than traditional full fine-tuning while greatly reducing the training cost of large foundation models.