Microsoft GreenAI: NLP Text Summarization (preview)

This repo currently contains samples for fine-tuning HuggingFace models for text summarization using Microsoft's Azure Machine Learning service. These samples can be adapted to fine-tune models for other NLP tasks or product scenarios.

What's available now?

AzureML v2 CLI examples for fine-tuning HuggingFace models
Quickstart ARM Templates for fine-tuning HuggingFace models
Fine-tuned HuggingFace models & results: https://huggingface.co/linydub

What's coming next?

Benchmarking and carbon accounting with MLflow and Azure Monitor Metrics (performance + resource metrics)
Interactive data visualization example with Azure Monitor Workbook
AML v2 CLI inference samples with ONNX Runtime and NVIDIA Triton (AML endpoint & deployment)
AML v2 CLI end-to-end pipeline samples
Repository documentation and detailed guide for the samples
More fine-tuned models and benchmark results

*More details about the project and future plans can be found here.

Fine-tuning Samples

These samples showcase various methods to fine-tune HuggingFace models using AzureML. All of the samples include DeepSpeed, FairScale, CodeCarbon, and MLflow integrations with no additional setup or code.

All logged training metrics are automatically reported to AzureML and MLflow. CodeCarbon also generates an emissions.csv file by default inside the outputs folder of the submitted run. To disable a package, omit it from the environment's conda file.

*A sample script for retrieving and aggregating MLflow and resource usage data will be available in the next update.
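Until that script is available, the emissions.csv file written by CodeCarbon can be summarized with the standard library alone. The following is a minimal sketch, not the repo's own tooling: the function name `summarize_emissions` is hypothetical, and the column names `duration`, `energy_consumed`, and `emissions` are assumed to match the header of your file (check it first, since columns can differ between CodeCarbon versions).

```python
import csv
import io


def summarize_emissions(csv_text):
    """Sum selected numeric columns across all rows of an emissions.csv.

    Assumed columns (verify against your file's header):
      duration         run time in seconds
      energy_consumed  energy in kWh
      emissions        emissions in kg CO2-equivalent
    """
    totals = {"duration": 0.0, "energy_consumed": 0.0, "emissions": 0.0}
    rows = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows += 1
        for key in ("duration", "energy_consumed", "emissions"):
            value = row.get(key)
            if value:  # skip missing or empty cells
                totals[key] += float(value)
    totals["rows"] = rows
    return totals
```

To use it on a downloaded run, read the file's contents (for example with `open("outputs/emissions.csv").read()`, where the path is an assumption about where you saved the run outputs) and pass the text to `summarize_emissions`.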

Quickstart

Fine-tune a HuggingFace Model

Fine-tune with DeepSpeed ZeRO Optimizations

Hyperparameter Sweep with HyperDrive

More advanced ARM Templates will be available here.
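For readers who want to see what the DeepSpeed ZeRO quickstart configures under the hood, the sketch below builds a small DeepSpeed configuration of the kind the HuggingFace Trainer integration accepts. It is an illustration under stated assumptions, not the configuration shipped with these templates: the helper name `build_zero_config` is hypothetical, and the chosen values (ZeRO stage 2, `"auto"` placeholders that the HuggingFace integration fills in from its own training arguments) are defaults picked for the example.

```python
import json


def build_zero_config(stage=2):
    """Return a minimal DeepSpeed config dict with ZeRO enabled.

    Hypothetical helper for illustration; the values are assumptions,
    not the settings used by this repo's templates.
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("ZeRO stage must be 0, 1, 2, or 3")
    return {
        "zero_optimization": {"stage": stage},
        # "auto" lets the HuggingFace Trainer integration substitute
        # the matching value from its own training arguments.
        "fp16": {"enabled": "auto"},
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
    }


# Serialize to JSON, the format DeepSpeed config files use.
config_json = json.dumps(build_zero_config(stage=2), indent=2)
```

Writing `config_json` to a file and passing that file's path through the Trainer's `deepspeed` training argument is the usual way to apply such a config when running a HuggingFace training script.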

AzureML v2 CLI Examples

Fine-tuning samples using the AML v2 CLI can be found here.

Inference Samples

Jupyter Notebooks

Notebook	Description

Support/Feedback

Please file an issue through the repo or email me at liny62@uw.edu. Feedback is greatly appreciated 🤗


Directory	Description
`cloud`	Cloud-specific configuration code
`docs`	Project docs & images
`examples`	AzureML examples for sample tasks

linydub/azureml-greenai-txtsum