Add a prepro script that downloads, preprocesses, and tokenizes WikiText-103, just like tiny shakespeare / tiny stories, following this repo. Adapt the mainline training script train_gpt2.cu to report the validation performance on this set. (A rough sketch of the prepro follows below.)
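For reference, here is a minimal sketch of what the prepro could look like, assuming tiktoken for the GPT-2 BPE and a raw uint16 token dump. The actual llm.c .bin format may include a header, so the real script should mirror the tiny shakespeare prepro exactly; the S3 URL is the original Salesforce hosting and may have moved:

```python
# prepro_wikitext103.py
# Sketch: download WikiText-103 (raw), tokenize with GPT-2 BPE via tiktoken,
# and dump train/val token streams as uint16 files. The headerless dump
# format here is an assumption -- match the tiny shakespeare prepro for
# whatever .bin layout train_gpt2.cu actually reads.
import os
import zipfile
import requests
import numpy as np
import tiktoken

DATA_URL = "https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-103-raw-v1.zip"
DATA_DIR = "data/wikitext103"

def download():
    os.makedirs(DATA_DIR, exist_ok=True)
    zip_path = os.path.join(DATA_DIR, "wikitext-103-raw-v1.zip")
    if not os.path.exists(zip_path):
        print(f"downloading {DATA_URL} ...")
        with open(zip_path, "wb") as f:
            f.write(requests.get(DATA_URL).content)
    with zipfile.ZipFile(zip_path) as z:
        z.extractall(DATA_DIR)  # extracts a wikitext-103-raw/ directory

def tokenize_split(txt_path, out_path):
    enc = tiktoken.get_encoding("gpt2")
    with open(txt_path, "r", encoding="utf-8") as f:
        text = f.read()
    tokens = enc.encode_ordinary(text)  # plain BPE, no special tokens
    tokens = np.array(tokens, dtype=np.uint16)  # GPT-2 vocab (50257) fits in uint16
    tokens.tofile(out_path)
    print(f"{out_path}: {len(tokens)} tokens")

if __name__ == "__main__":
    download()
    raw = os.path.join(DATA_DIR, "wikitext-103-raw")
    tokenize_split(os.path.join(raw, "wiki.train.raw"),
                   os.path.join(DATA_DIR, "wikitext103_train.bin"))
    tokenize_split(os.path.join(raw, "wiki.valid.raw"),
                   os.path.join(DATA_DIR, "wikitext103_val.bin"))
```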
Add Python code that does the same: evaluate on WikiText-103 and report performance for all the GPT-2 model sizes. This is our baseline to reach when training from scratch init.
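One way this baseline could look, sketched with HuggingFace transformers (gpt2 / gpt2-medium / gpt2-large / gpt2-xl are the standard hub checkpoint names); the val .bin path assumes the prepro sketch above. This uses simple non-overlapping 1024-token windows, so a stride-based sliding window would report slightly lower perplexity:

```python
# eval_gpt2_wikitext103.py
# Sketch: report WikiText-103 validation loss / perplexity for all four
# pretrained GPT-2 checkpoints, chunked into non-overlapping context windows.
import numpy as np
import torch
from transformers import GPT2LMHeadModel

VAL_BIN = "data/wikitext103/wikitext103_val.bin"  # from the prepro sketch above
SIZES = ["gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl"]

@torch.no_grad()
def eval_model(name, tokens, ctx_len=1024, device=None):
    device = device or ("cuda" if torch.cuda.is_available() else "cpu")
    model = GPT2LMHeadModel.from_pretrained(name).to(device).eval()
    total_loss, total_preds = 0.0, 0
    # iterate over full windows; the partial tail is dropped
    for i in range(0, len(tokens) - ctx_len, ctx_len):
        x = torch.from_numpy(tokens[i:i + ctx_len].astype(np.int64))[None].to(device)
        out = model(x, labels=x)  # HF shifts labels internally
        n = ctx_len - 1           # predicted positions per window
        total_loss += out.loss.item() * n
        total_preds += n
    return total_loss / total_preds

if __name__ == "__main__":
    tokens = np.fromfile(VAL_BIN, dtype=np.uint16)
    for name in SIZES:
        loss = eval_model(name, tokens)
        print(f"{name}: val loss {loss:.4f}, ppl {np.exp(loss):.2f}")
```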
Optionally, help research other ways that people have evaluated GPT-2 models or attempted to reproduce them in the past.
We are abandoning WikiText-103 because it's a total mess. We'll instead look at one or a few of ARC Easy / Challenge, SQuAD, HellaSwag, TriviaQA, and LAMBADA. Closing.
I've seen some repos use WikiText-103 as the eval dataset for GPT-like models, e.g.:
https://github.com/tysam-code/hlb-gpt/tree/main