
AI CUP 2022: Argument Detection

Task Description

Input data

  • q: An English discourse.
  • r: An English text that responds to q.
  • s: The discussion relationship between r and q.

Output data

  • q′ & r′: Subsequences of q and r, respectively. q′ and r′ provide enough key information to judge that the relationship between q and r is s.
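For illustration, a hypothetical input/output pair might look like this. The field names follow the task description above; the text and the relationship label are made up, and the official competition file format may differ:

```python
# Illustrative example only -- not the official AI CUP data format.
example = {
    "q": "Social media does more harm than good. It spreads misinformation quickly.",
    "r": "I disagree. It also lets people fact-check claims in real time.",
    "s": "DISAGREE",  # discussion relationship between r and q (label value assumed)
}

# The expected output: subsequences of q and r that carry enough
# key information to justify the relationship s.
expected = {
    "q'": "Social media does more harm than good.",
    "r'": "I disagree.",
}
```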

See the full task description here.

Introduction

  • We redefine this task as an extractive summarization task, which can also be regarded as a sentence classification task. By calculating a score for each sentence, we decompose q and r into sentences and recompose the high-scoring sentences into the arguments q′ and r′.
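The decompose-score-recompose idea can be sketched as follows. This is a toy stand-in, not the repository's code: the scorer here is a dummy function in place of the trained model, and a naive regex split stands in for proper sentence tokenization (presumably done with nltk, which appears in the requirements):

```python
import re

def extract_argument(text, score_sentence, threshold=0.5):
    """Decompose text into sentences, keep those scoring above the
    threshold, and recompose them into the extracted argument."""
    # Naive sentence split; the notebooks would use a real tokenizer instead.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    kept = [s for s in sentences if score_sentence(s) > threshold]
    return " ".join(kept)

# Toy scorer: pretend longer sentences carry more information.
toy_scorer = lambda s: min(len(s.split()) / 10, 1.0)

result = extract_argument(
    "Short one. This sentence is quite a bit longer and carries more detail.",
    toy_scorer,
)
```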

Core Model Structure


  • We build two input sequences from q and r. Since BERT's maximum input length is 512 tokens, we first summarize q and r down to at most 450 tokens each, using the pretrained bert-extractive-summarizer library. We then feed the sequences into BERT encoders and take their pooler outputs sq and sr, concatenating them with their inner product sq*sr. Finally, we pass the concatenated vector through dense layers to obtain the score. Besides the main sequence classification task, we also design a co-task that classifies the relationship s to help training.
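The scoring head described above can be sketched as follows. This is a sketch, not the repository's actual code: the BERT encoders are omitted (stand-in: precomputed pooler vectors), the hidden size 768 matches bert-base, sq*sr is interpreted as an elementwise product, and the number of relationship classes is an assumption:

```python
import torch
import torch.nn as nn

class PairScorer(nn.Module):
    """Concat(s_q, s_r, s_q*s_r) -> dense layers -> score,
    plus an auxiliary head for the relationship co-task."""

    def __init__(self, hidden=768, num_relations=2):  # num_relations assumed
        super().__init__()
        self.score_head = nn.Sequential(
            nn.Linear(hidden * 3, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )
        self.relation_head = nn.Linear(hidden * 3, num_relations)

    def forward(self, s_q, s_r):
        # Elementwise product as the "inner product" feature.
        feats = torch.cat([s_q, s_r, s_q * s_r], dim=-1)
        return self.score_head(feats).squeeze(-1), self.relation_head(feats)

# Stand-in pooler outputs for a batch of 2 (normally produced by BERT).
s_q, s_r = torch.randn(2, 768), torch.randn(2, 768)
score, rel_logits = PairScorer()(s_q, s_r)
```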

Usage

Requirements

pip install -q torch pytorch-lightning
pip install -q transformers
pip install -q bert-extractive-summarizer
pip install -q nltk==3.7

Model Path

https://drive.google.com/file/d/1-QVxqGodzD0FEQNVOqDsWsgHfu9fkHwO/view?usp=share_link

Train

(Remember to change all file paths to your own!)

  1. Download train data here.
  2. Run utils/preprocessing.ipynb for data preprocessing. You may also just use the processed data here.
  3. Run train.ipynb to start training.
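The co-task mentioned in the model description suggests training with a combined objective. A minimal sketch of such a loss, under stated assumptions (the loss functions and the 0.5 auxiliary weight are illustrative choices, not taken from train.ipynb):

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # main task: is this sentence part of the argument?
ce = nn.CrossEntropyLoss()    # co-task: classify the relationship s

def combined_loss(score_logits, sent_labels, rel_logits, rel_labels, aux_weight=0.5):
    """Main sentence-scoring loss plus a weighted auxiliary relationship loss."""
    return bce(score_logits, sent_labels.float()) + aux_weight * ce(rel_logits, rel_labels)

# Dummy batch of 4 sentences with 2 relationship classes.
loss = combined_loss(
    torch.randn(4), torch.tensor([1, 0, 1, 0]),
    torch.randn(4, 2), torch.tensor([0, 1, 1, 0]),
)
```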

Predict

(Remember to change all file paths to your own!)

  1. Download test data here.
  2. Download pretrained model here or from link above.
  3. Run predict.ipynb to start predicting.

Final Scores

Our best answer is here.

  • Public Score: 0.815819
  • Private Score: 0.867684
