Multi-head attention (created with DALL·E 3)
A step-by-step derivation and implementation of the GPT architecture from scratch, following the original GPT paper, Improving Language Understanding by Generative Pre-Training (Radford et al. 2018), and the transformer paper, Attention is All You Need (Vaswani et al. 2017). This is mostly a personal exercise to deepen my understanding of multi-head self-attention, the transformer, causal language modelling and unsupervised pre-training, but it can also serve as a guide for anyone interested in deriving the GPT architecture from first principles.
- PyTorch>=2.1.0
The complete derivation walkthrough is in the Jupyter notebook derive-gpt-from-scratch.ipynb.
By the end of the walkthrough, we will have a GPT model that can write Shakespeare-style plays (or gibberish).
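As a taste of what the walkthrough builds up to, here is a minimal sketch of causal multi-head self-attention in PyTorch, the core building block of the GPT decoder. It is illustrative only: the class and parameter names (`CausalSelfAttention`, `d_model`, `n_heads`, `max_len`) are my own and do not necessarily match the code in the notebook.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with a causal mask, as used in a GPT block."""

    def __init__(self, d_model: int, n_heads: int, max_len: int = 256):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # single projection producing queries, keys and values for all heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.proj = nn.Linear(d_model, d_model)
        # lower-triangular mask: position i may only attend to positions <= i
        self.register_buffer("mask", torch.tril(torch.ones(max_len, max_len)).bool())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape                       # batch, sequence length, d_model
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape to (B, n_heads, T, d_head) so each head attends independently
        q = q.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        # scaled dot-product attention with the causal mask applied
        att = (q @ k.transpose(-2, -1)) / (self.d_head ** 0.5)
        att = att.masked_fill(~self.mask[:T, :T], float("-inf"))
        att = F.softmax(att, dim=-1)
        out = att @ v                           # (B, n_heads, T, d_head)
        out = out.transpose(1, 2).reshape(B, T, C)
        return self.proj(out)

# toy usage: a batch of 2 sequences of length 16 with model width 64
x = torch.randn(2, 16, 64)
print(CausalSelfAttention(d_model=64, n_heads=4)(x).shape)  # torch.Size([2, 16, 64])
```

Stacking this layer with feed-forward blocks, residual connections and layer normalization, then training it on next-token prediction, is what the notebook derives step by step.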
This project references the following resources:
- Improving Language Understanding by Generative Pre-Training (Radford et al. 2018)
- Attention is All You Need (Vaswani et al. 2017)
- GPT Guide by Andrej Karpathy
This project is licensed under the MIT License. Please see the LICENSE file for more details.