Helpful scripts for training ByT5-type models:
- small_from_wiki: pretraining byt5-small from scratch on a Wikipedia dataset from TensorFlow Datasets (TFDS), using Google's TensorFlow repo
- base_from_text: pretraining byt5-base from scratch on a local text file, using Google's TensorFlow repo
- small_hf_flax_from_dataset: pretraining byt5-small from scratch on a Hugging Face dataset, using Hugging Face's Flax example script
- simplet5_finetuning: fine-tuning byt5-small and byt5-basque with SimpleT5 (PR pending)
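
All of the scripts above feed raw UTF-8 bytes to the model instead of subword tokens. As a rough illustration of what that input looks like, here is a minimal sketch of byte-level encoding. It assumes the published ByT5 vocabulary layout (ids 0, 1 and 2 reserved for padding, end-of-sequence and unknown, so every byte value is shifted by 3). The function names `encode_bytes` and `decode_bytes` are hypothetical and are not taken from any of the scripts.

```python
from typing import List

# Assumption: ids 0 (pad), 1 (end of sequence) and 2 (unknown) are reserved,
# so each UTF-8 byte value is shifted up by 3.
SPECIAL_TOKEN_OFFSET = 3
EOS_ID = 1


def encode_bytes(text: str) -> List[int]:
    """Map text to byte-level ids and append the end-of-sequence id."""
    ids = [b + SPECIAL_TOKEN_OFFSET for b in text.encode("utf-8")]
    ids.append(EOS_ID)
    return ids


def decode_bytes(ids: List[int]) -> str:
    """Drop reserved and out-of-range ids, then decode the remaining bytes."""
    raw = bytes(
        i - SPECIAL_TOKEN_OFFSET
        for i in ids
        if SPECIAL_TOKEN_OFFSET <= i < 256 + SPECIAL_TOKEN_OFFSET
    )
    return raw.decode("utf-8", errors="ignore")


if __name__ == "__main__":
    ids = encode_bytes("kaixo ñ")
    print(ids)
    print(decode_bytes(ids))
```

Non-ASCII characters take more than one id each ("ñ" becomes two byte ids), which is worth keeping in mind when choosing maximum sequence lengths for pretraining or fine-tuning.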