Introduction

Here we list all Program-of-Thoughts (PoT) results, where each model answers a question by generating a program.
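As a rough illustration of what "through program generation" means in practice, the sketch below executes a generated program and compares its result with a reference answer. It is a minimal sketch under stated assumptions, not the evaluation harness behind the numbers on this page: the helper names `run_program` and `is_correct`, the convention that the program stores its result in a variable called `ans`, and the toy program itself are all hypothetical.

```python
import math


def run_program(program: str, answer_var: str = "ans"):
    """Execute a generated program and return the value bound to `answer_var`.

    Returns None if the program raises or never defines the variable.
    Assumption: the answer is stored in a variable named `ans`.
    """
    namespace = {}
    try:
        # Model output is untrusted code; a real harness should sandbox this call.
        exec(program, namespace)
    except Exception:
        return None
    return namespace.get(answer_var)


def is_correct(predicted, expected, rel_tol: float = 1e-3) -> bool:
    """Compare a predicted numeric answer with the reference, within a tolerance."""
    if predicted is None:
        return False
    try:
        return math.isclose(float(predicted), float(expected), rel_tol=rel_tol)
    except (TypeError, ValueError):
        return False


# Toy stand-in for a model-generated program (hypothetical, for illustration only).
generated = """
eggs_per_day = 16
eaten = 3
baked = 4
price = 2
ans = (eggs_per_day - eaten - baked) * price
"""

prediction = run_program(generated)
print(prediction, is_correct(prediction, 18))  # prints "18 True"
```

A program that crashes or never sets `ans` is simply counted as wrong here, which is one reasonable design choice among several.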

Few-shot Prompting

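| Model | Params | GSM8K | TheoremQA |
|---|---|---|---|
| ChatGPT | ? | 76.3 | 35.6 |
| Codex | 175B | 71.6 | 23.9 |
| GPT-3 | 175B | 60.4 | 16.6 |
| PaLM | 540B | 51.3 | - |
| PaLM-Coder | 540B | 50.9 | - |
| codegen-mono | 15B | 12.7 | 11.8 |
| codet5+ | 15B | 12.5 | 11.6 |
| xgen | 7B | 11.0 | 11.4 |
| codegen-multi | 15B | 8.2 | 10.2 |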
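
For readers unfamiliar with the few-shot setting, the following sketch shows one way a prompt could be assembled from worked exemplars and how a per-benchmark score could be computed as a percentage of correct answers. The prompt layout, the function names `build_few_shot_prompt` and `accuracy_percent`, and the exemplars are assumptions made for illustration; they are not the prompts or scripts used to produce the table above.

```python
def build_few_shot_prompt(exemplars, question):
    """Concatenate (question, program) exemplars, then the new question.

    The "Question:" / "# Python program:" layout is a hypothetical format.
    """
    parts = []
    for ex_question, ex_program in exemplars:
        parts.append(f"Question: {ex_question}\n# Python program:\n{ex_program}\n")
    # The final block is left open so the model completes it with a program.
    parts.append(f"Question: {question}\n# Python program:\n")
    return "\n".join(parts)


def accuracy_percent(flags):
    """Percentage of True entries in a list of per-question correctness flags."""
    if not flags:
        return 0.0
    return 100.0 * sum(flags) / len(flags)


# Hypothetical exemplars, for illustration only.
exemplars = [
    ("What is 2 plus 3?", "ans = 2 + 3"),
    ("What is half of 10?", "ans = 10 / 2"),
]
prompt = build_few_shot_prompt(exemplars, "What is 7 times 6?")
print(prompt)
print(accuracy_percent([True, False, True, True]))  # prints 75.0
```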