Skip to content

allenai/twentyquestions

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

twentyquestions

Code for crowdsourcing commonsense with 20 Questions.

This repository includes:

  • A web application for playing 20 Questions on Mechanical Turk.
  • amti HIT Definitions for running tasks on Mechanical Turk.
  • Scripts for post-processing the data derived from the HITs.

Documentation

To get a quick overview of the repository, jump to Quickstart. Otherwise, you can find topic specific documentation below:

  • Setup: How to fully set up the repository for deploying the 20 Questions app to cloud infrastructure and running HITs on Mechanical Turk.
  • Development: In-depth development documentation, including serving the app in development and production, as well as deploying to Kubernetes.
  • Running HITs: How to run the HITs on Mechanical Turk using amti.

Quickstart

To quickly get things up and running, first install the dependencies:

$ pip install -r requirements.txt

Then, run the unittests:

$ python -m unittest

Assuming everything checks out, you can now execute the manage.py script, the main interface for working with this repository:

$ python manage.py --help
Usage: manage.py [OPTIONS] COMMAND [ARGS]...

  A high-level interface to admin scripts for twentyquestions.

Options:
  -v, --verbose        Turn on verbose logging for debugging purposes.
  -l, --log-file TEXT  Log to the provided file path instead of stdout.
  -h, --help           Show this message and exit.

Commands:
  build                  Build twentyquestions.
  create_splits          Write splits for the 20Qs data at DATA_PATH...
  deploy                 Deploy twentyquestions to ENV.
  dockerize              Create the docker image for running...
  extractgames           Extract games from XML_DIR and write to...
  extractlabels          Extract labeling data from XML_DIR and write...
  extractmirrorsubjects  Extract mirror subjects from XML_DIR and...
  extractquality         Extract quality labels from XML_DIR and write...
  extractquestions       Extract questions from XML_DIR and write to...
  extracttypes           Extract commonsense types from XML_DIR and...
  groupbysubject         Group the data in blocks of at most 20 by...
  promote                Promote the docker image from SOURCE to DEST.
  serve                  Serve twentyquestions on port 5000.

The manage.py script is self-documenting, and lists out all the actions you might want to perform using the code base. Use the --help option to view additional documentation. For example:

$ python manage.py build --help
Usage: manage.py build [OPTIONS]

  Build twentyquestions.

  Build twentyquestions by building the frontend client and copying it into
  the proper location for the backend.

Options:
  -h, --help  Show this message and exit.

To deploy the twentyquestions application to infrastructure, or to run the HITs on Mechanical Turk, you'll need to follow the in-depth setup documentation. If you want to run or extend the code, checkout the development documentation; or, if you want to run the HITs on Mechanical Turk see Running HITs.

Data

The 20 Questions data can be found here. The tarball contains 20 Questions style questions and answers along with other metadata. In it, you'll find a README.md describing all the attributes in more detail.

Contact

This code was authored by Nick Lourie for Mosaic, reach out to him with any questions.

About

A web application for playing 20 Questions to crowdsource common sense. 🤖

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published