
Transformer Autoencoder (TAE)

A simple transformer-based autoencoder model.

  • Encoder and decoder are both vanilla ViT models (see the sketch after this list).
  • The skeleton of the code is recycled from Facebook's MAE repository with several simplifications.
  • Work in progress.
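
The sketch below illustrates this encoder/decoder layout. It assumes the timm ViT building blocks (PatchEmbed, Block) that the MAE codebase imports; the model name, depths, widths, and head counts are illustrative placeholders, not this repository's actual settings.

```python
# Minimal sketch of a ViT-encoder / ViT-decoder autoencoder (illustrative settings).
import torch
import torch.nn as nn
from timm.models.vision_transformer import PatchEmbed, Block


class TinyTAE(nn.Module):  # hypothetical name, not the repository's model class
    def __init__(self, img_size=224, patch_size=16, in_chans=3,
                 enc_dim=768, dec_dim=512, depth=4, num_heads=8):
        super().__init__()
        # encoder: patchify, add positional embeddings, run vanilla ViT blocks
        self.patch_embed = PatchEmbed(img_size, patch_size, in_chans, enc_dim)
        num_patches = self.patch_embed.num_patches
        self.enc_pos = nn.Parameter(torch.zeros(1, num_patches, enc_dim))
        self.encoder = nn.ModuleList([Block(enc_dim, num_heads) for _ in range(depth)])
        self.enc_norm = nn.LayerNorm(enc_dim)

        # decoder: project latents to decoder width, run ViT blocks, predict pixels
        self.dec_embed = nn.Linear(enc_dim, dec_dim)
        self.dec_pos = nn.Parameter(torch.zeros(1, num_patches, dec_dim))
        self.decoder = nn.ModuleList([Block(dec_dim, num_heads) for _ in range(depth)])
        self.dec_norm = nn.LayerNorm(dec_dim)
        self.dec_pred = nn.Linear(dec_dim, patch_size ** 2 * in_chans)

    def forward(self, imgs):
        x = self.patch_embed(imgs) + self.enc_pos
        for blk in self.encoder:
            x = blk(x)
        latents = self.enc_norm(x)              # (B, num_patches, enc_dim)

        y = self.dec_embed(latents) + self.dec_pos
        for blk in self.decoder:
            y = blk(y)
        pred = self.dec_pred(self.dec_norm(y))  # per-patch pixel predictions
        return latents, pred
```

A plain autoencoder of this kind would be trained by minimizing a pixel-level reconstruction loss (e.g. MSE between pred and the patchified input), with no adversarial terms.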

Why transformer-based autoencoders?

  • Better representational alignment with transformer models used in downstream tasks, e.g. diffusion transformers.
  • The ability to trade embedding dimensionality for a much smaller spatial size, e.g. training diffusion transformers on a 4x4 spatial grid (16 spatial tokens). This is possible in principle with convnet-based autoencoders too, but it is more natural and convenient with transformers. Since transformer complexity scales quadratically with the number of spatial tokens and only linearly with dimensionality, this trade-off yields more compute-efficient models (see the rough cost comparison after this list) and opens the door to training models on massively larger images/videos.
  • Current "first stage models" used for image/video compression are too complicated, e.g. using adversarial losses (among others). I'd like to simplify this process by showing simple plain autoencoders are performant as first stage models.
