dora-tang/SemEval-2020-Task-7

Duluth at SemEval-2020 Task 7

This is the codebase for our SemEval 2020 paper: Duluth at SemEval-2020 Task 7: Using Surprise as a Key to Unlock Humorous Headlines.

@inproceedings{duluth2020humor,
    title = "Duluth at SemEval-2020 Task 7: Using Surprise as a Key to Unlock Humorous Headlines",
    author = "Shuning Jin and Yue Yin and XianE Tang and Ted Pedersen",
    booktitle = "Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval 2020)",
    year = "2020",
    url = "https://arxiv.org/abs/2009.02795"
}

Task Introduction

SemEval-2020 Task 7: Assessing Humor in Edited News Headlines

  • Competition page
    • Subtask 1: regression task to predict the funniness score of an edited headline
    • Subtask 2: classification task to predict the funnier between two edited headlines
  • Leaderboard for official evaluation
    • webpage version: “Evaluation-Task-1” is for Subtask 1 and “Evaluation-Task-2” is for Subtask 2.
    • cleaned csv version
    • Our system ranks 11/49 (0.531 RMSE) in Subtask 1, and 9/32 (0.632 accuracy) in Subtask 2.
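For reference, the two metrics quoted above (RMSE for Subtask 1, accuracy for Subtask 2) can be computed as in this minimal sketch. The function names are illustrative assumptions and are not taken from this repository or from the official scorer.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, the Subtask 1 metric (lower is better)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def accuracy(labels_true: Sequence[int], labels_pred: Sequence[int]) -> float:
    """Fraction of exactly matching labels, the Subtask 2 metric (higher is better)."""
    if len(labels_true) != len(labels_pred) or len(labels_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(labels_true, labels_pred))
    return correct / len(labels_true)
```

Both helpers take plain Python sequences, so they can be applied to columns read from a prediction file without extra dependencies.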

1 Configuration

conda environment

  • packages are specified in environment.yml

  • requires conda3: Anaconda 3 or Miniconda 3

    • create conda environment:

      conda env create -f environment.yml
    • activate/deactivate the environment:

      # linux/mac (conda>=4.4):
      conda activate humor
      conda deactivate
      # linux/mac (conda<4.4):
      source activate humor
      source deactivate
      # windows:
      activate humor
      deactivate

spaCy

python -m spacy download en

HuggingFace Transformers Cache Directory

We use the transformers library by HuggingFace. Set a cache directory so that each pretrained model is downloaded only once.

# replace `/path/to/cache/directory` with your directory
CACHE=/path/to/cache/directory
bash scripts/path_setup.sh HUGGINGFACE_TRANSFORMERS_CACHE $CACHE
# this will add the following line to ~/.bash_profile (mac) or ~/.bashrc (linux)
# export HUGGINGFACE_TRANSFORMERS_CACHE=/path/to/cache/directory
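Once the variable is exported, a Python process can pick it up as in the sketch below. This is only an illustration of the idea: `resolve_cache_dir` is a hypothetical helper, the model name in the comment is a placeholder, and how this codebase actually reads the variable is not shown here.

```python
import os
from typing import Optional

# Variable name taken from the setup step above.
ENV_VAR = "HUGGINGFACE_TRANSFORMERS_CACHE"


def resolve_cache_dir(default: Optional[str] = None) -> Optional[str]:
    """Return the cache directory from the environment, or `default` if unset or empty."""
    value = os.environ.get(ENV_VAR, "").strip()
    return value or default


if __name__ == "__main__":
    cache_dir = resolve_cache_dir()
    print(f"{ENV_VAR} -> {cache_dir}")
    # Assumed usage with the transformers library (not executed here;
    # "bert-base-uncased" is only a placeholder model name):
    #   from transformers import AutoModel
    #   model = AutoModel.from_pretrained("bert-base-uncased", cache_dir=cache_dir)
```

If the variable is missing, the helper falls back to `default`, and passing `cache_dir=None` leaves the library on its own default cache location.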

2 Data

  • see the data directory
  • datasets: Humicroedit (official task data) and Funlines (additional training data)
  • you can download the data from the source websites, or simply run
    bash scripts/download_data.sh
    This produces the same data as in the data directory.

3 Experiment output

experiment_directory
├── log.log
├── params.json
# if args.save_model
├── model_state.th
# if args.tensorboard
├── tensorboard_train
├── tensorboard_val
# if args.do_eval
└── output-{eval_data_name}.csv
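To check which of these artifacts a finished run actually produced, a small helper along the following lines may be handy. It is a sketch that relies only on the file names in the layout above; `summarize_experiment` is a hypothetical name, and nothing is assumed about the contents of `params.json` or the CSV columns.

```python
import json
from pathlib import Path
from typing import Any, Dict


def summarize_experiment(exp_dir: str) -> Dict[str, Any]:
    """Report which expected artifacts exist in an experiment directory."""
    root = Path(exp_dir)
    params_path = root / "params.json"
    # Load the saved parameters if present; otherwise report None.
    params = json.loads(params_path.read_text()) if params_path.is_file() else None
    return {
        "has_log": (root / "log.log").is_file(),
        "params": params,
        "has_model": (root / "model_state.th").is_file(),
        "has_tensorboard": (root / "tensorboard_train").is_dir()
        and (root / "tensorboard_val").is_dir(),
        # One CSV per evaluation set, named output-{eval_data_name}.csv.
        "eval_outputs": sorted(p.name for p in root.glob("output-*.csv")),
    }
```

Optional entries (model state, tensorboard directories, evaluation CSVs) simply show up as `False` or an empty list when the corresponding flag was not set for the run.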

To see tensorboard output:

open http://localhost:6006
tensorboard --logdir tensorboard_train
# you may need to wait a few seconds and refresh the page

open http://localhost:6006
tensorboard --logdir tensorboard_val
# you may need to wait a few seconds and refresh the page

4 Scripts

  • Baseline

    • Baseline 1 uses the average score; Baseline 2 uses the majority label.

    • output is written to the baseline_output directory

      bash scripts/baseline.sh > baseline_output/results.log
  • To reproduce our main experiment based on the contrastive approach (Table 2 in the paper):

    • produce the highlighted numbers (i.e., best within each model type), with this script

      bash scripts/table2_best.sh
    • produce the whole table, with this script

      bash scripts/table2.sh
    • Table 2

  • To reproduce the additional analysis on the non-contrastive approach (Table 3 in the paper)

    • use this script

      bash scripts/table3.sh
    • Table 3
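The two baselines mentioned at the top of this section are simple enough to sketch in a few lines. The version below assumes that the average score and the majority label are computed on the training set and then predicted for every test item; the function names are hypothetical and this is not the code in scripts/baseline.sh.

```python
from collections import Counter
from statistics import mean
from typing import Hashable, List, Sequence


def mean_score_baseline(train_scores: Sequence[float], n_test: int) -> List[float]:
    """Baseline 1 (assumed form): predict the mean training score for every test item."""
    average = mean(train_scores)
    return [average] * n_test


def majority_label_baseline(train_labels: Sequence[Hashable], n_test: int) -> List[Hashable]:
    """Baseline 2 (assumed form): predict the most frequent training label for every test item."""
    most_common_label, _count = Counter(train_labels).most_common(1)[0]
    return [most_common_label] * n_test
```

Because both baselines ignore the headline text entirely, they give a floor against which the RMSE and accuracy of the trained models can be compared.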
