Transformers and Language Models

The purpose of this repository is to act as an entry point to the world of transformer-based language modelling in PyTorch. It is pitched at those of us who want to understand implementation details and grasp theoretical insights, without having to wade through badly written research code 🙂

All code has been structured into a Python package called modelling, which is organised as follows:

└── src
    ├── modelling
    │   ├── data.py
    │   ├── rnn.py
    │   ├── transformer.py
    │   └── utils.py

We have done our best to make the code as readable as possible and to document it comprehensively, so this is the place to go for the implementation details. To see it in action, use the following notebooks:

notebooks
├── 0_attention_and_transformers.ipynb
├── 1_datasets_and_dataloaders.ipynb
├── 2_text_generation_with_rnns.ipynb
├── 3_text_generation_with_transformers.ipynb
├── 4_pre_trained_transformers_for_search.ipynb
└── 5_pre_trained_transformers_for_sentiment_analysis.ipynb

These will guide you through the steps required to use the code contained within the modelling package to train a language model, and then use it to perform semantic search and sentiment classification tasks.

Installing

To run the notebooks and use the code within the src/modelling directory, either clone this repository and install the package directly from the source code,

pip install .

Or install it directly from this repository,

pip install git+https://github.com/AlexIoannides/transformers.git@main
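After either install route, a quick check that the package can be found on the import path may be useful. The following is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and is not part of the modelling package.

```python
from importlib import util


def is_installed(module_name: str = "modelling") -> bool:
    """Return True if a top-level module with this name is on the import path.

    Hypothetical helper for illustration; it only locates the module and
    does not import or execute it.
    """
    return util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed():
        print("modelling is importable")
    else:
        print("modelling was not found; try one of the pip commands above")
```

If the check fails, the most likely cause is that pip installed the package into a different Python environment from the one running the notebooks.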

Presentation Slides

An HTML (and PDF) presentation of this work is contained in the presentation_slides directory.

Useful Resources

We found the following useful in our ascent up the transformer and LLM learning curve:

The Annotated Transformer - Attention Is All You Need, the paper that introduced the transformer architecture for sequence-to-sequence modelling, annotated with PyTorch code snippets that demonstrate how to implement the concepts from first principles.
Transformers and Multi-Head Attention - a comprehensive tutorial from Lightning AI that demonstrates how to compose and train a simple generative language model using the latest techniques for training transformer models.
Language Modelling with nn.Transformer and torchtext - a tutorial from PyTorch that demonstrates how to use PyTorch's transformer layers to train a simple generative language model.
Transformer Architecture: The Positional Encoding - a deep-dive into positional encoding and its role in transformer models.
