
SimplifiedTransformers

This repository provides an implementation of Simplifying Transformer Blocks (He & Hofmann, 2023). Standard transformer blocks are complex, and the interplay of their many components can make architectures unstable. Using signal propagation theory and empirical observations, the paper proposes simplifications that remove skip connections, projection/value parameters, sequential sub-blocks, and normalization layers. The simplified blocks match the per-update training speed and performance of standard transformers, while enjoying 15% faster training throughput and using 15% fewer parameters.
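
For intuition, here is a minimal sketch of a block in this spirit: only query/key projections remain (the value and output projections are removed), attention and the MLP run in parallel, and there are no skip connections or normalization layers. The class name SimplifiedBlockSketch is hypothetical and this is not the repository's implementation; the paper's actual blocks additionally rely on shaped attention to keep signal propagation stable.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplifiedBlockSketch(nn.Module):
    """Illustrative sketch only, not the repository's implementation:
    a parallel attention + MLP block with no skip connections, no
    normalization layers, and no value/output projections."""

    def __init__(self, dim: int, heads: int):
        super().__init__()
        self.heads = heads
        # Only query/key projections remain; the value and output
        # projections are removed, so attention mixes the raw inputs.
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        h, dh = self.heads, d // self.heads
        q = self.to_q(x).view(b, n, h, dh).transpose(1, 2)
        k = self.to_k(x).view(b, n, h, dh).transpose(1, 2)
        v = x.view(b, n, h, dh).transpose(1, 2)  # identity "values"
        attn = F.softmax(q @ k.transpose(-2, -1) / dh**0.5, dim=-1)
        attended = (attn @ v).transpose(1, 2).reshape(b, n, d)
        # Attention and MLP act on the same input in parallel and their
        # outputs are summed directly; there is no residual branch.
        return attended + self.mlp(x)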

Install

pip3 install --upgrade simplified-transformer-torch


Usage

import torch
from simplified_transformers.main import SimplifiedTransformers

# Model with embedding dimension 4096, 6 blocks, 8 attention heads,
# and a 20,000-token vocabulary.
model = SimplifiedTransformers(
    dim=4096,
    depth=6,
    heads=8,
    num_tokens=20000,
)

# Random token ids: batch size 1, sequence length 4096.
x = torch.randint(0, 20000, (1, 4096))

out = model(x)
print(out.shape)
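
Assuming the model ends in a projection over the vocabulary, the printed shape would be torch.Size([1, 4096, 20000]): one logit per vocabulary entry at each of the 4096 positions.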

License

MIT
