Milestone 1: Working pipeline on small novel use case

Past due by about 1 month · 68% complete

Dataset

  • Forget set: Biographical author details across all authors
  • Retain set: All non-biographical questions & answers
  • Train set: All synthetic authors data
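
The split above can be sketched as follows. This is illustrative only: the field names (`question`, `answer`, `is_biographical`) are placeholders, not the actual TOFU schema, and in practice the forget/retain subsets would come from the published dataset configs.

```python
# Illustrative split of a synthetic-author QA set into forget / retain /
# train. Field names are placeholders, not the real TOFU schema.

def split_dataset(examples):
    """Biographical QA pairs go to the forget set, everything else to
    the retain set; the train set is all synthetic-author data."""
    forget = [ex for ex in examples if ex["is_biographical"]]
    retain = [ex for ex in examples if not ex["is_biographical"]]
    train = list(examples)
    return forget, retain, train

examples = [
    {"question": "Where was the author born?",
     "answer": "Oslo", "is_biographical": True},
    {"question": "What themes does the novel explore?",
     "answer": "Memory and loss", "is_biographical": False},
]
forget, retain, train = split_dataset(examples)
```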

Fine-tuning

  • Fine-tune a small model (e.g. GPT2) on the train set
  • Fine-tune a small model (e.g. GPT2) on the retain set
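
A minimal sketch of the fine-tuning step, using a tiny causal LM as a stand-in for GPT2 (the real pipeline would load GPT2 via Hugging Face transformers, but the shape of the next-token cross-entropy loop is the same):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
VOCAB = 32  # toy vocabulary size, stand-in for GPT2's tokenizer

class TinyLM(nn.Module):
    """Toy causal LM: embedding + linear head, in place of GPT2."""
    def __init__(self, vocab=VOCAB, dim=16):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, ids):
        return self.head(self.emb(ids))

def lm_loss(model, ids):
    # Predict token t+1 from token t: standard causal LM objective.
    logits = model(ids[:, :-1])
    return nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), ids[:, 1:].reshape(-1))

model = TinyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
batch = torch.randint(0, VOCAB, (8, 10))  # fake tokenised train set

first = lm_loss(model, batch).item()
for _ in range(50):
    opt.zero_grad()
    lm_loss(model, batch).backward()
    opt.step()
final = lm_loss(model, batch).item()  # should be well below `first`
```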

Forgetting

  • Unlearn the forget set using gradient ascent [& another?]

Evaluating

  • Run the forget-quality and model-utility evaluations.
  • Initial sanity check (debug fine-tuning): GPT2 base model vs. GPT2 fine-tuned on all of TOFU.
  • Then: GPT2 unlearnt model vs. GPT2 fine-tuned on the retain set (the gold standard).
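
The evaluation comparisons above can be sketched as follows. This is a deliberately simplified stand-in: TOFU's actual metrics (truth ratio, KS-test-based forget quality) are more involved, and all numbers below are hypothetical. It only shows the comparison structure: utility from retain-set loss, and forget quality as the gap between the unlearnt model and the retain-only gold model on the forget set.

```python
from statistics import mean

def utility(retain_losses):
    # Lower retain-set loss -> higher utility, squashed into (0, 1].
    return 1.0 / (1.0 + mean(retain_losses))

def forget_gap(unlearned_losses, gold_losses):
    # Zero gap == indistinguishable from the retain-only gold model.
    return abs(mean(unlearned_losses) - mean(gold_losses))

# Hypothetical per-example losses:
unlearned_forget = [3.1, 2.8, 3.4]  # unlearnt model on the forget set
gold_forget = [3.0, 3.2, 3.3]       # retain-only model on the forget set
retain_losses = [0.9, 1.1, 1.0]     # unlearnt model on the retain set

u = utility(retain_losses)
g = forget_gap(unlearned_forget, gold_forget)
```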