Evaluate OpenAI's Whisper on (almost) any Hugging Face Hub dataset

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web.

It enables transcription in multiple languages, as well as translation from those languages into English.

This notebook provides an easy-to-use interface to evaluate Whisper on audio recordings of text passages sampled from Hugging Face Datasets.

The notebook sets up a Gradio UI that allows the user to:

  1. Sample a text passage from any dataset hosted on the Hugging Face Hub
  2. Record an audio snippet narrating the passage
  3. Transcribe the audio with Whisper
  4. Save the audio, the transcribed and reference text, and the word error rate to Comet for further evaluation and analysis
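
Under the hood, this workflow can be reproduced outside the UI in a few lines. Below is a minimal sketch, assuming the openai-whisper and datasets packages; the dataset, column, and file names are placeholders, not the notebook's actual defaults:

```python
import random

import whisper
from datasets import load_dataset

# Load a Whisper checkpoint (tiny, base, small, medium, ...).
model = whisper.load_model("base")

# Sample a text passage from a Hub dataset. If no seed is given,
# the UI generates a random one, mirrored here with random.randint.
seed = random.randint(0, 2**32 - 1)
dataset = load_dataset("squad", split="train")
passage = dataset.shuffle(seed=seed)[0]["context"]
print(f"Narrate this passage:\n{passage}")

# After recording yourself narrating the passage, transcribe the file.
result = model.transcribe("recording.wav")
print(result["text"])
```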

The Evaluation UI

Open In Colab


To use the Evaluation UI, you will need a Comet account.

The UI has the following input components:

  1. Dataset Name. The root name of the text dataset on the Hugging Face Hub.
  2. Subset. Certain datasets on the Hub are divided into subsets; use this field to identify one. If there is no subset, leave it blank.
  3. Split. Which split of the dataset to sample from, e.g. train or test.
  4. Seed. The random seed for sampling. If the seed isn't set, one is generated automatically.
  5. Audio. Once you have sampled a text passage, hit the record button and narrate the passage.

Finally, hit the Transcribe button. That's it! All the relevant data will be logged to Comet for further analysis.
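
The whole interface can be wired up with Gradio. Here is a rough sketch of the input layout described above, assuming a recent Gradio release; the transcribe callback is a stub, not the notebook's implementation:

```python
import gradio as gr

def transcribe(dataset_name, subset, split, seed, audio_path):
    # Stub: sample the passage, run Whisper on audio_path,
    # compute the word error rate, and log everything to Comet.
    return "transcription goes here"

demo = gr.Interface(
    fn=transcribe,
    inputs=[
        gr.Textbox(label="Dataset Name"),
        gr.Textbox(label="Subset (optional)"),
        gr.Textbox(label="Split", value="train"),
        gr.Number(label="Seed (optional)"),
        gr.Audio(sources=["microphone"], type="filepath", label="Audio"),
    ],
    outputs=gr.Textbox(label="Transcription"),
)
demo.launch()
```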

Logging Evaluations to Comet

The Evaluation UI will log the following data to a Comet Experiment.

You can check out an example project here.

Model Parameters

  1. Model Type (tiny, base, small, medium, etc.)
  2. Beam Search Width
  3. Model Language

Dataset Parameters

  1. Dataset Name
  2. Dataset Subset
  3. Column in the dataset that contains the text
  4. Split (train, test, etc.) of the dataset
  5. Seed used to sample the text passage from the dataset
  6. Sample Text Length

Model Metrics

  1. Word Error Rate Score
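
The README doesn't name the WER implementation; the jiwer package is a common choice and works as a stand-in here. The score counts substitutions, deletions, and insertions against the number of reference words:

```python
from jiwer import wer

reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over a lazy dog"

# WER = (substitutions + deletions + insertions) / reference word count
score = wer(reference, hypothesis)
print(f"WER: {score:.2f}")  # two substitutions over nine words -> 0.22
```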

Evaluation Assets

  1. Audio Snippet of the narrated text
  2. Reference/Sampled Text
  3. Transcribed Text
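
Taken together, here is a sketch of how these parameters, metrics, and assets might be logged with the comet_ml SDK; the key names and values below are illustrative, not the notebook's exact schema:

```python
from comet_ml import Experiment

# Reads the Comet API key from the environment or config file.
experiment = Experiment(project_name="whisper-eval")

# Model and dataset parameters for this run.
experiment.log_parameters({
    "model_type": "base",
    "beam_width": 5,
    "language": "en",
    "dataset_name": "squad",
    "dataset_subset": None,
    "text_column": "context",
    "split": "train",
    "seed": 42,
    "sample_text_length": 512,
})

# Metric and assets: WER score, narrated audio, reference and transcribed text.
experiment.log_metric("wer", 0.22)
experiment.log_audio("recording.wav")
experiment.log_text("reference passage ...", metadata={"type": "reference"})
experiment.log_text("transcribed text ...", metadata={"type": "transcription"})
```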
