large model landscape.md

Category Component Owner Closed source or proprietary OSS license Commercial use Model size (B) Release date Code/paper Stars Description
Multimodal ImageBind Meta License No Github 5.9k ImageBind: One Embedding Space to Bind Them All
Image DeepFloyd IF stability.ai License, Model license Github 6.4k Text-to-image model with a high degree of photorealism and language understanding
Image Stable Diffusion Version 2 stability.ai MIT, unknown Github 23.5k High-Resolution Image Synthesis with Latent Diffusion Models
Image DALL-E OpenAI Modified MIT Yes Github 10.3k PyTorch package for the discrete VAE used for DALL·E.
Image DALL·E 2 OpenAI Yes product
Image DALLE2-pytorch lucidrains MIT Yes Github 9.7k Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
Speech Whisper OpenAI MIT Yes Github 37.7k Robust Speech Recognition via Large-Scale Weak Supervision
Speech MMS Meta Yes paper
Code model Codex OpenAI Yes 12 2021/7/1 blog Paper
Code model AlphaCode 41 Feb 2022 Competition-Level Code Generation with AlphaCode
Code model StarCoder BigCode No Apache 15 May 2023 Github 4.8k Language model (LM) trained on source code and natural language text
Code model CodeGen Salesforce No ? Github 3.6k model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
Code model Replit Code replit 3 May 2023 replit-code-v1-3b model is a 2.7B LLM trained on 20 languages from the Stack Dedup v1.2 dataset.
Code model CodeGen2 Salesforce BSD Yes 1, 3, 7, 16 May 2023 Github Code models for program synthesis.
Code model CodeT5 and CodeT5+ Salesforce BSD Yes 16 May 2023 CodeT5 CodeT5 and CodeT5+ models for Code Understanding and Generation from Salesforce Research.
language model GPT June 2018 GPT Improving Language Understanding by Generative Pre-Training
language model BERT Oct 2018 BERT Bidirectional Encoder Representations from Transformers
language model RoBERTa 0.125 - 0.355 July 2019 RoBERTa A Robustly Optimized BERT Pretraining Approach
language model GPT-2 1.5 Nov 2019 GPT-2 Language Models are Unsupervised Multitask Learners
language model T5 0.06 - 11 Oct 2019 Flan-T5 Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
language model XLNet Jun 2019 XLNet Generalized Autoregressive Pretraining for Language Understanding
language model ALBERT 0.235 Sep 2019 ALBERT A Lite BERT for Self-supervised Learning of Language Representations
language model CTRL 1.63 Sep 2019 CTRL CTRL: A Conditional Transformer Language Model for Controllable Generation
language model GPT 3 Azure Yes 175 May 2020 Paper Language Models are Few-Shot Learners
language model GShard 600 Jun 2020 Paper GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
language model BART Jul 2020 BART Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
language model mT5 13 Oct 2020 mT5 mT5: A massively multilingual pre-trained text-to-text transformer
language model PanGu-α 13 April 2021 PanGu-α PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
language model CPM-2 198 Jun 2021 CPM CPM-2: Large-scale Cost-effective Pre-trained Language Models
language model GPT-J 6B EleutherAI No Yes 6 June 2021 GPT-J-6B A 6 billion parameter, autoregressive text generation model trained on The Pile.
language model ERNIE 3.0 Baidu Yes 10 July 2021 ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
language model Jurassic-1 178 Aug 2021 Jurassic-1: Technical Details and Evaluation
language model ERNIE 3.0 Titan 260 Dec 2021 ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
language model HyperCLOVA 82 Sep 2021 What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
language model FLAN 137 2021/10/1 Paper Finetuned Language Models Are Zero-Shot Learners
language model GPT 3.5 Azure Yes
language model GPT 4 Azure Yes 2023/3/1
language model ERNIE 3.0 Baidu Yes 10 2021/7/1 Paper
language model Jurassic-1 178 2021/8/1 Paper
language model T0 11 Oct 2021 T0 Multitask Prompted Training Enables Zero-Shot Task Generalization
language model Yuan 1.0 245 Oct 2021 Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning
language model WebGPT 175 Dec 2021 WebGPT: Browser-assisted question-answering with human feedback
language model Gopher 280 Dec 2021 Scaling Language Models: Methods, Analysis & Insights from Training Gopher
language model GLaM 1200 Dec 2021 GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
language model LaMDA Bard Yes 137 Jan 2022 Paper LaMDA: Language Models for Dialog Applications
language model MT-NLG 530 Jan 2022 Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
language model InstructGPT 175 Mar 2022 Training language models to follow instructions with human feedback
language model Chinchilla 70 Mar 2022 Shows that for a compute budget, the best performances are not achieved by the largest models but by smaller models trained on more data.
language model GPT-NeoX-20B 20 April 2022 GPT-NeoX-20B GPT-NeoX-20B: An Open-Source Autoregressive Language Model
language model Tk-Instruct 11 April 2022 Tk-Instruct-11B Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
language model PaLM Google Yes 540 April 2022 PaLM: Scaling Language Modeling with Pathways
language model OPT Meta No Yes 175 May 2022 OPT-13B, OPT-66B, Paper OPT: Open Pre-trained Transformer Language Models
language model OPT-IML 30, 175 Dec 2022 OPT-IML OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization
language model GLM-130B 130 Oct 2022 GLM-130B GLM-130B: An Open Bilingual Pre-trained Model
language model AlexaTM 20 Aug 2022 AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
language model Flan-T5 11 Oct 2022 Flan-T5-xxl Scaling Instruction-Finetuned Language Models
language model Sparrow 70 Sep 2022 Improving alignment of dialogue agents via targeted human judgements
language model UL2 20 Oct 2022 UL2, Flan-UL2 UL2: Unifying Language Learning Paradigms
language model U-PaLM 540 Oct 2022 Transcending Scaling Laws with 0.1% Extra Compute
language model BLOOM BigScience No Yes 176 Nov 2022 BLOOM, Paper BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
language model mT0 13 Nov 2022 mT0-xxl Crosslingual Generalization through Multitask Finetuning
language model Galactica 0.125 - 120 Nov 2022 Galactica Galactica: A Large Language Model for Science
language model ChatGPT Nov 2022 A model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests.
language model LLaMA Meta No No 7, 13, 33, 65 2023/2/1 Paper, LLaMA LLaMA: Open and Efficient Foundation Language Models
language model GPT-4 March 2023
language model PanGu-Σ Yes 1085 2023/3/1 PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing
language model BloombergGPT 50 March 2023 BloombergGPT: A Large Language Model for Finance
language model Cerebras-GPT Cerebras No Yes 0.111 - 13 2023/3 hf Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster
language model oasst-sft-1-pythia-12b LAION-AI No Yes 12 2023/3 HF OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
language model Pythia EleutherAI No Yes 0.070 - 12 2023/3 Pythia, Paper A suite of 16 LLMs all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters.
language model StableLM No No 3, 7 April 2023 Github Stability AI's StableLM series of language models
language model Dolly 2.0 Databricks No Yes 3, 7, 12 2023/4 Dolly An instruction-following LLM, fine-tuned on a human-generated instruction dataset licensed for research and commercial use.
language model DLite 0.124 - 1.5 2023/5 HF Lightweight instruction following models which exhibit ChatGPT-like interactivity.
language model MPT-7B MosaicML No Apache 2 Yes 7 2023/5/5 blog a GPT-style model, and the first in the MosaicML Foundation Series of models.
language model h2oGPT 12 2023/5 HF h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document question-answering capabilities.
language model LIMA 65 2023/5 A 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling.
language model RedPajama-INCITE 3, 7 2023/5 HF A family of models including base, instruction-tuned & chat models.
language model Gorilla 7 2023/5 Gorilla Gorilla: Large Language Model Connected with Massive APIs
language model Med-PaLM 2 2023/5 Towards Expert-Level Medical Question Answering with Large Language Models
language model PaLM 2 2023/5 A language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor, PaLM.
language model Falcon LLM 7, 40 2023/5 7B, 40B Foundational large language model (LLM) with 40 billion parameters trained on one trillion tokens.
language model Claude Anthropic Yes
language model GPT-Neo EleutherAI No Yes
language model GPT-NeoX EleutherAI No Yes 20 2022/2/1 Paper
language model FastChat-T5-3B LMSYS No Apache Yes 2023/4/
language model OpenLLaMA openlm-research No Yes
language model OpenChatKit Together No Yes
language model YaLM Yandex No Yes 100 2022/6/1 Github
language model ChatGLM-6B Tsinghua No ChatGLM-6B No 6 2023/3/1 Github
language model Alpaca Stanford No No
language model Vicuna No No 13 2023/3/1 Blog
language model StableVicuna No No
language model RWKV-4-Raven-7B BlinkDL No No
language model Alpaca-LoRA tloen No No
language model Koala BAIR No No 13 2023/4/1 Blog