[Experiments] Various experiments related to MindGPT #67

Open · wants to merge 27 commits into develop
Conversation

@dudeperf3ct (Member) commented Aug 22, 2023

In this PR, we add four notebooks, each focusing on a different experiment. Each notebook is a fully reproducible experiment environment.

  • big_models_experiment.ipynb: This notebook compares responses for NHS and Mind across various LLMs. Starting with flan-t5-base and keeping all other variables constant, five additional LLMs were tested: flan-t5-large, flan-t5-xl, fastchat-t5, openllam-3b, and redpajama-3b-instruct (a minimal sketch of this comparison loop follows the list).
  • Data Chunking: This experiment has two parts, captured in two separate notebooks (see the chunking sketch after this list):
    • data_chunking_experiments.ipynb: This notebook handpicks a small subset of the dataset relevant to the question and experiments with different values of chunk_size and chunk_overlap.
    • data_chunking_experiments_third_version_openllm.ipynb: This notebook takes the same approach as above, except that instead of handpicking data points the entire dataset (the latest version) is used. Two further improvements have been added: the flan-t5-xl LLM and a more advanced prompt template.
  • prompt_engineering_experiment.ipynb: This notebook experiments with five different prompts across three LLMs (flan-t5-base, flan-t5-large, and flan-t5-xl); see the prompt × model sketch below.
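
A minimal sketch of the model-comparison loop from big_models_experiment.ipynb, assuming the Hugging Face transformers pipeline API; the question below is a hypothetical placeholder, not one of the notebook's evaluation questions:

```python
from transformers import pipeline

# Hub IDs for three of the models compared; the remaining models follow the
# same pattern through their respective checkpoints.
MODELS = ["google/flan-t5-base", "google/flan-t5-large", "google/flan-t5-xl"]
QUESTION = "What are the symptoms of anxiety?"  # hypothetical example question

for model_name in MODELS:
    # Hold the prompt and generation settings constant; only the model varies.
    generator = pipeline("text2text-generation", model=model_name)
    response = generator(QUESTION, max_new_tokens=128)[0]["generated_text"]
    print(f"--- {model_name} ---\n{response}\n")
```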
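
For the chunking notebooks, a sketch of a chunk_size/chunk_overlap sweep, assuming LangChain's RecursiveCharacterTextSplitter; the grid values and document text are illustrative, not taken from the notebooks:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# A handpicked NHS/Mind passage would go here in the first notebook; the
# second notebook iterates over the entire (latest) dataset instead.
document = "Anxiety is a feeling of unease, such as worry or fear..."

for chunk_size in (256, 512, 1024):      # illustrative grid
    for chunk_overlap in (0, 64, 128):   # illustrative grid
        splitter = RecursiveCharacterTextSplitter(
            chunk_size=chunk_size, chunk_overlap=chunk_overlap
        )
        chunks = splitter.split_text(document)
        print(f"chunk_size={chunk_size}, chunk_overlap={chunk_overlap} "
              f"-> {len(chunks)} chunks")
```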
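
And a sketch of the prompt × model grid from prompt_engineering_experiment.ipynb; the templates below are hypothetical stand-ins, not the five prompts actually tested:

```python
from transformers import pipeline

# Hypothetical prompt templates; the notebook tests five of these.
PROMPTS = [
    "Answer the question: {q}",
    "You are a helpful mental-health assistant. Answer clearly: {q}",
    "Using only trusted NHS/Mind guidance, answer: {q}",
]
MODELS = ["google/flan-t5-base", "google/flan-t5-large", "google/flan-t5-xl"]
QUESTION = "How can I manage stress?"  # hypothetical example question

for model_name in MODELS:
    generator = pipeline("text2text-generation", model=model_name)
    for template in PROMPTS:
        answer = generator(template.format(q=QUESTION), max_new_tokens=128)
        print(f"{model_name} | {template!r} -> {answer[0]['generated_text']}")
```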

@dudeperf3ct self-assigned this Aug 22, 2023
@dudeperf3ct added the enhancement label Aug 22, 2023