🔩 Understanding the nuts and bolts of GPT

A short-straight-visualized notebook course to build your own GPT. Yes, from scratch 👣

🔤 Some background

I didn't expect 2023 to deliver the future we have been dreaming of for decades. I grew up with the "toys" that gave a glimpse of what AI could look like, such as the ones that generate realistic facial images, if you've heard of thispersondoesnotexist.com and a couple of others.

But now, there is a hell, let me repeat, a hell of a lot of generative tools out there, popping up nearly every day. As a CS/DL student, I have always been curious about how the heck this thing works.

I started with newbie questions:

Can ChatGPT remember things?
Oh, those are the weights, but how does it manage to create a perfect, grammatically correct sentence!?
It also takes care of CaPitaliZation, and of proper indentation when asked for code, etc...

It had to be explored in great depth. And that's where I found Andrej Karpathy 🤘
And this notebook course takes you on the journey to GPT, starting from the very first baby steps.

😬 Means what? What does it include anyways?

Karpathy's series is fantastic, but I love things spelled out even more and, as a cherry on top, visualized. I also want to be able to look up a specific topic later, and if I can't find it by scrubbing through the video, I get frustrated.

This notebook course implements all the code and builds up the math understanding with informal explanations in the simplest language possible, you know, the bro's language, yo!? 😆 We will also keep building one step at a time 🧗

🖼 Vizzes, a lot of vizzes

🤓 How to read it?

I would encourage you to use this course as material for practising your understanding and as a place to refer back to specifics in the future, but use Andrej's series as the main guide.

🤗 Each notebook is carefully crafted for to-the-point reading
➗ Derivations and expressions are explored, experimented with and visualized in the dude's words. So, don't sweat the maths.
📔 Each notebook has a lot of visuals to understand the flow, be it backpropagation, bug hunting or just visualizing the neural net.
👨‍💻 Code is there, code is there... and is reproducible.
📎 Clips to the specific portion of the lecture to refer to.

I mean, there's a lot there... just start exploring!!

🗺 Where to start?

Start with the very first lecture, Micrograd.

Micrograd: Our first steps to the neural net. Here we will explore expressions at the atomic level and how the slope, or derivative, is calculated. We will build a little classifier too.
Makemore: The first version of our GPT. Not exactly a GPT, though, because it doesn't use the transformer; Makemore is simply the name of the model. The model is capable of generating human names. We will start with a simple bigram model; no ML/DL yet, just, you know, the first stepping stone.
Makemore with NN: Here we will show how a neural net can learn the same relationships as the bigram model (the manual model) built in the previous lecture.
Improve on NN: This is the core. Here we will meet the diagnostic tools: how to diagnose the model and see what went wrong, plus batchnorm and other exciting stuff for when the model isn't learning or... is learning wrong.
Backprop Ninja: This will boost your backpropagation confidence like hell. We will backpropagate through each layer, manually, spelled out, visualized, codized.
WaveNet: A new architecture to train the model.
GPT: Truly spelled out GPT. From the embeddings to masked multi-head self-attention. Visualized. Demystified. Together, we will end up creating a GPT that actually completes text!
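
To give a taste of that "first stepping stone", here is a minimal sketch of a count-based bigram name generator in plain Python. It is not the code from the notebooks or from Andrej's lectures, just one way the idea could look. The helper names (`train_bigram`, `sample_name`), the `.` start/end marker and the four toy names are all assumptions made up for this illustration.

```python
import random
from collections import defaultdict


def train_bigram(names):
    """Count how often each character follows another.

    A "." is used (an assumption of this sketch) to mark both the
    start and the end of a name.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for name in names:
        chars = ["."] + list(name) + ["."]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_name(counts, rng):
    """Generate one name by repeatedly sampling the next character
    in proportion to how often it followed the previous one."""
    out = []
    prev = "."
    while True:
        followers = counts[prev]
        chars = list(followers.keys())
        weights = list(followers.values())
        nxt = rng.choices(chars, weights=weights, k=1)[0]
        if nxt == ".":  # end marker sampled: the name is complete
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


# Toy, hypothetical dataset; a real run would use a long list of names.
names = ["emma", "olivia", "ava", "mia"]
counts = train_bigram(names)

rng = random.Random(0)  # seeded so runs are repeatable
samples = [sample_name(counts, rng) for _ in range(5)]
print(samples)
```

No learning happens here: the "model" is just a table of counts, which is exactly why it makes a good baseline to compare a neural net against later.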

🦘 It's not ended; yet.

Andrej's last lecture is on GPT, but there is a lot to explore. So to be continued...

📰 As seen on

How Open Ai’s Andrej Karpathy Made One of the Best Tutorials in Deep Learning, by Usama Ahmed: He does a phenomenal job of reviewing Andrej's entire course, module by module. Must check out!

AayushSameerShah/Neural-Net-Zero-to-Hero-with-Andrej
