Attention Is All You Need, Vaswani et al., Google Brain, 2017.
Improving Language Understanding by Generative Pre-Training, Radford et al., OpenAI, 2018.
Language Models are Unsupervised Multitask Learners, Radford et al., OpenAI, 2019.
Original GPT model
- 12 transformer layers
- Embedding dimension of 768
- 12 attention heads
- Feed-forward dimension of 3072
- Adam optimizer with max learning rate of 2.5e-4
- Trained for 100 epochs on minibatches of 64 randomly sampled, contiguous sequences of 512 tokens
- GELU activation function (these hyperparameters are sketched in code after this list)
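As a concrete reference for the hyperparameters above, here is a minimal sketch of one GPT-style decoder block in PyTorch. The module structure and names are illustrative assumptions, not the original codebase; only the dimensions, optimizer, learning rate, and activation come from the paper.

```python
import torch
import torch.nn as nn

# Hyperparameters from the list above
N_LAYERS = 12   # transformer layers
D_MODEL = 768   # embedding dimension
N_HEADS = 12    # attention heads
D_FF = 3072     # feed-forward dimension
SEQ_LEN = 512   # tokens per training sequence

class GPTBlock(nn.Module):
    """One decoder block; GPT-1 applies LayerNorm after each residual (post-norm)."""
    def __init__(self):
        super().__init__()
        self.attn = nn.MultiheadAttention(D_MODEL, N_HEADS, batch_first=True)
        self.ln1 = nn.LayerNorm(D_MODEL)
        self.mlp = nn.Sequential(
            nn.Linear(D_MODEL, D_FF),
            nn.GELU(),  # GELU activation, as in the paper
            nn.Linear(D_FF, D_MODEL),
        )
        self.ln2 = nn.LayerNorm(D_MODEL)

    def forward(self, x):
        # Causal mask: True entries are positions a token may NOT attend to
        mask = torch.triu(torch.ones(x.size(1), x.size(1), dtype=torch.bool), 1)
        a, _ = self.attn(x, x, x, attn_mask=mask)
        x = self.ln1(x + a)
        x = self.ln2(x + self.mlp(x))
        return x

model = nn.Sequential(*[GPTBlock() for _ in range(N_LAYERS)])
opt = torch.optim.Adam(model.parameters(), lr=2.5e-4)  # max learning rate from the paper
```

Stacking 12 such blocks plus token and learned position embeddings gives the full model; the paper warms the 2.5e-4 rate up linearly and then anneals it to 0 with a cosine schedule.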
Attention Is All You Need model
- Shared source-target BPE vocabulary of about 37,000 tokens
- Base model trained for 100,000 steps, with 4,000 warm-up steps (schedule sketched after this list)
- Translation quality drops off with too many attention heads; the paper's ablations also show single-head attention is about 0.9 BLEU worse than the best setting
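The warm-up steps feed the paper's learning-rate schedule, lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5): linear warm-up for the first 4,000 steps, then inverse-square-root decay. A minimal sketch in plain Python, using the base model's values:

```python
def transformer_lr(step: int, d_model: int = 512, warmup_steps: int = 4000) -> float:
    """Learning-rate schedule from Attention Is All You Need (base model defaults)."""
    step = max(step, 1)  # avoid 0 ** -0.5 at step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

# Peaks at step == warmup_steps, then decays as 1/sqrt(step):
print(transformer_lr(4_000))    # ~7.0e-4
print(transformer_lr(100_000))  # ~1.4e-4
```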