This repository has been archived by the owner on Jan 23, 2024. It is now read-only.



Norod/hebrew-gpt_neo


hebrew-gpt_neo

Hebrew text generation models based on EleutherAI's gpt-neo. Each was trained on a TPUv3-8 which was made available to me via the TPU Research Cloud Program.

JS Colab notebook Open in Google Colab

Gradio Colab notebook Open in Google Colab

Datasets

  1. An assortment of Hebrew corpora - I have made it available here

  2. oscar / unshuffled_deduplicated_he - Homepage | Dataset Permalink

The Open Super-large Crawled ALMAnaCH coRpus (OSCAR) is a huge multilingual corpus obtained by language classification and filtering of the Common Crawl corpus using the goclassy architecture.

Models

hebrew-gpt_neo-xl

hebrew-gpt_neo-small

hebrew-gpt_neo-tiny
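
As a rough illustration of how one of the models listed above might be used for text generation, here is a minimal sketch with the Hugging Face `transformers` library. It rests on assumptions not stated in this README: that the checkpoints are published on the Hugging Face Hub under a `Norod78` namespace, and that `transformers` and PyTorch are installed. The helper names `model_id` and `generate` and the sampling settings are illustrative only, not the settings used by the author's notebooks.

```python
# Assumption: the checkpoints live on the Hugging Face Hub under this
# namespace. The README itself only gives the bare model names.
HUB_NAMESPACE = "Norod78"

# The three model sizes listed in this README.
MODEL_SIZES = ("xl", "small", "tiny")


def model_id(size: str) -> str:
    """Build a Hub-style model id for one of the listed sizes."""
    if size not in MODEL_SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {MODEL_SIZES}")
    return f"{HUB_NAMESPACE}/hebrew-gpt_neo-{size}"


def generate(prompt: str, size: str = "tiny", max_new_tokens: int = 50) -> str:
    """Sample a continuation of `prompt` from the chosen model."""
    # Imported lazily so model_id() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = model_id(size)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")

    # Illustrative sampling settings, chosen for this sketch only.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("שלום, קוראים לי", size="tiny")` would download the assumed checkpoint on first use and return the prompt followed by sampled text. Starting with the tiny model keeps the download and memory footprint small before trying the larger ones.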