This repository provides functionality for Stochastic Weight Averaging-Gaussian (SWAG) training of Transformer models. The implementation builds on two libraries:
- `transformers` (maintained by Hugging Face)
- `swa_gaussian` (maintained by the Language Technology Research Group at the University of Helsinki)
The goal is an implementation that works directly with the convenience tools in the `transformers` library (e.g. `Pipeline` and `Trainer`) as well as with `evaluator` from the related `evaluate` library.

See also the examples.
BERT model, sequence classification task:

- Load a pretrained BERT model: `base_model = AutoModelForSequenceClassification.from_pretrained(name_or_path)`
- Initialize the SWAG model: `swag_model = SwagBertForSequenceClassification.from_base(base_model)`
- Initialize the SWAG callback object: `swag_callback = SwagUpdateCallback(swag_model)`
- Initialize `transformers.Trainer` with `base_model` as the model and `swag_callback` in the callbacks.
- Train the model (`trainer.train()`).
- Store the complete model: `swag_model.save_pretrained(path)`
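The steps above can be sketched end to end as follows. This is a minimal sketch, not the repository's exact example: the `swag_transformers.*` import paths, the checkpoint name, `num_labels`, and the `train_dataset` are assumptions for illustration and should be adapted to your setup.

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)
# Import paths below are assumptions; adjust them to this repository's layout.
from swag_transformers.swag_bert import SwagBertForSequenceClassification
from swag_transformers.trainer_utils import SwagUpdateCallback

name_or_path = "bert-base-uncased"  # any BERT checkpoint

# 1. Load the pretrained base model.
base_model = AutoModelForSequenceClassification.from_pretrained(
    name_or_path, num_labels=2
)

# 2. Wrap it in a SWAG model that will collect weight statistics.
swag_model = SwagBertForSequenceClassification.from_base(base_model)

# 3. The callback updates the SWAG parameters during training.
swag_callback = SwagUpdateCallback(swag_model)

# 4. Train the *base* model; the callback keeps swag_model in sync.
trainer = Trainer(
    model=base_model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,  # assumed to be prepared elsewhere
    callbacks=[swag_callback],
)
trainer.train()

# 5. Store the complete SWAG model.
swag_model.save_pretrained("swag_model")
```

Note that the `Trainer` is given the plain `base_model`; the SWAG wrapper only accumulates the weight statistics via the callback and is saved separately at the end.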
- BERT
  - `BertPreTrainedModel` -> `SwagBertPreTrainedModel`
  - `BertModel` -> `SwagBertModel`
  - `BertForSequenceClassification` -> `SwagBertForSequenceClassification`
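A stored SWAG model can then be used like any other classifier. The sketch below assumes the standard Hugging Face loading convention (`from_pretrained`), which the use of `save_pretrained` above suggests; the import path is again an assumption.

```python
import torch
from transformers import AutoTokenizer
# Import path is an assumption; adjust it to this repository's layout.
from swag_transformers.swag_bert import SwagBertForSequenceClassification

# Reload the stored SWAG model (assumed Hugging Face convention,
# mirroring save_pretrained above).
swag_model = SwagBertForSequenceClassification.from_pretrained("swag_model")

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("An example sentence.", return_tensors="pt")
with torch.no_grad():
    logits = swag_model(**inputs).logits
```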