jwolle1/jeopardy_clue_dataset

jeopardy_clue_dataset

This dataset contains Jeopardy! clues from Season 1 through Season 39 (July 2023). It does not contain every clue that has appeared on the show. The data source prefers not to be credited.

There are 473,067 clues in total. Most of them can be found in combined_season1-39.tsv. This file is approx. 68 MB.

There are also individual files for each season (located in the seasons folder). These files are small enough that you should be able to open them with Microsoft Excel or Google Sheets.

  • Seasons 1–11 average 8,821 clues each.
  • Seasons 12–38 average 13,260 clues each.
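If you want to check clue counts per season yourself, the following is a minimal sketch. It assumes every file in the seasons folder ends in .tsv, is UTF-8 encoded, and starts with a header row; the function name count_clues_per_file is made up for illustration.

```python
import csv
from pathlib import Path


def count_clues_per_file(folder):
    """Map each .tsv file name in `folder` to its number of data rows."""
    counts = {}
    for path in sorted(Path(folder).glob("*.tsv")):
        with open(path, newline="", encoding="utf-8") as f:
            # QUOTE_NONE keeps quotation marks in clue text as ordinary characters.
            reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
            next(reader, None)  # skip the header row (assumed to be present)
            counts[path.name] = sum(1 for _ in reader)
    return counts
```

Calling count_clues_per_file("seasons") returns a dictionary; an average over any group of seasons is then the sum of the relevant values divided by how many there are.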

The kids_teen.tsv file contains only clues that appeared in Kids and Teen Tournament matches. These clues are already in the combined dataset; the file is included for convenience.

Clues appearing in special matches outside the daily syndicated program are found in extra_matches.tsv. This file has 4,750 clues and they do not appear in the combined dataset.
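Because the extra matches are kept out of the combined file, reading both files together is one way to work with every clue at once. This is only a sketch: it assumes both files are UTF-8 with a header row, and the source_file key is a hypothetical addition (not a column in the dataset) so that rows from extra_matches.tsv, whose round values follow a different scheme, can still be told apart.

```python
import csv
from pathlib import Path


def iter_clues(paths):
    """Yield each data row from several TSV files, tagged with its source file."""
    for path in paths:
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row in reader:
                # Hypothetical extra key, not part of the dataset itself.
                row["source_file"] = Path(path).name
                yield row
```

For example, iter_clues(["combined_season1-39.tsv", "extra_matches.tsv"]) walks through the combined file first and the extra matches second.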

I've done my best to clean the data and filter out clues that depend on images, video, or audio.


Column Information

Label               Description
round               1 for Single Jeopardy, 2 for Double Jeopardy, or 3 for Final Jeopardy. (Note: These values are different in extra_matches.tsv to account for Triple Jeopardy.)
clue_value          The clue's value on the board before any Daily Double wagering.
daily_double_value  If the clue is a Daily Double, the amount wagered. Otherwise zero.
category            The category name, i.e. the top row of the board.
comments            The host's comments about a category.
answer              The prompt given to contestants.
question            The correct response.
air_date            The calendar date on which the episode first aired.
notes               Misc. information about the clue, e.g. if it's from a special tournament match.
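The columns above could be turned into a typed record along these lines. This is a sketch under assumptions, not a description of how the files were produced: it assumes the numeric columns always hold plain integers and that air_date is written as YYYY-MM-DD. The Clue class and parse_clue function are names invented for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Clue:
    round: int
    clue_value: int
    daily_double_value: int
    category: str
    comments: str
    answer: str      # the prompt given to contestants
    question: str    # the correct response
    air_date: date
    notes: str

    @property
    def is_daily_double(self):
        # Non-Daily-Double clues carry a wager of zero.
        return self.daily_double_value > 0


def parse_clue(row):
    """Convert one row (a dict of strings keyed by column label) into a Clue."""
    return Clue(
        round=int(row["round"]),
        clue_value=int(row["clue_value"]),
        daily_double_value=int(row["daily_double_value"]),
        category=row["category"],
        comments=row["comments"],
        answer=row["answer"],
        question=row["question"],
        # Assumes ISO format (YYYY-MM-DD); adjust if the files differ.
        air_date=date.fromisoformat(row["air_date"]),
        notes=row["notes"],
    )
```

Keeping the wager in its own field means clue_value always reflects the board, so Daily Doubles can be filtered with is_daily_double without losing the original dollar value.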

Other Data

A file with contestant scoring data can be found in the other_data folder. There are columns for each contestant's score after the Single, Double, and Final Jeopardy rounds. Most but not all episodes from combined_season1-39.tsv are included.


FAQ

How do I download the dataset?

If you're new to GitHub and aren't sure what's going on, click the green Code button near the top of the page, then click Download ZIP.

What is a .TSV file?

The data is written in plain text and organized like a spreadsheet with a TAB character between each cell. You can open the files with applications like Microsoft Excel or Google Sheets.
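If you'd rather read the files in code, here is a minimal sketch using Python's standard csv module. It assumes the file is UTF-8 encoded and that its first line is a header row with the column labels; read_tsv is just a name chosen for this example.

```python
import csv


def read_tsv(path):
    """Return a list of dicts, one per data row, keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        # Cells are separated by TAB characters. QUOTE_NONE treats quotation
        # marks inside clue text as ordinary characters.
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)
```

For example, rows = read_tsv("combined_season1-39.tsv") loads the combined file into memory, after which rows[0]["category"] gives the category of the first clue.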


All data is the property of Jeopardy Productions, Inc. and is protected under law. I am not affiliated with the show. Please don't use the data to make a public-facing website, app, or any other product.

About

A dataset containing 473,000 Jeopardy! clues (1984–2023).
