
Model description

The RST-LSTM module refines the predictions of a high-performance sequential text classifier (BERT) on documents with rhetorical structure.

Training pipeline

  1. In the first stage, the sequential model (BERT) is fine-tuned on the dataset, which includes texts of varying length and complexity.
  2. In the second stage, the base model is frozen and a discourse-aware neural module is trained on top of it for classifying texts with discourse structure (see the sketch below).
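
The snippet below is a minimal PyTorch sketch of this two-stage recipe, not the repository's actual training code: the checkpoint path is a placeholder, and a plain LSTM stands in for the RST-LSTM module trained in the notebooks.

```python
import torch
import torch.nn as nn
from transformers import AutoModelForSequenceClassification

# Stage 1: assume the sequential classifier has already been fine-tuned and
# saved; the path below is a placeholder, not a file from this repository.
bert = AutoModelForSequenceClassification.from_pretrained("stage1-checkpoint")

# Stage 2: freeze every BERT parameter so that only the new module is updated.
for param in bert.parameters():
    param.requires_grad = False

# A plain LSTM stands in for the discourse-aware RST-LSTM module; it consumes
# the label distributions produced by the frozen base model.
discourse_module = nn.LSTM(input_size=bert.config.num_labels,
                           hidden_size=128, batch_first=True)
optimizer = torch.optim.Adam(discourse_module.parameters(), lr=1e-3)
```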

Prediction pipeline

  1. The text is parsed with an end-to-end RST parser.
  2. BERT predictions are obtained for each discourse unit in the resulting structure.
  3. Non-elementary discourse structures, with BERT predictions attached to their units, are passed through the trained RST-LSTM (a sketch of this flow follows the list).
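
A hypothetical sketch of this inference flow is given below; parse_rst, bert_predict, and rst_lstm_predict are stand-ins for the parser, the fine-tuned BERT classifier, and the trained RST-LSTM defined elsewhere in the repository, and the tree interface is assumed for illustration.

```python
def classify_document(text, parse_rst, bert_predict, rst_lstm_predict):
    """Glue code for the prediction pipeline; the three callables and the
    tree interface (.edus, .text, .scores) are assumptions for illustration."""
    tree = parse_rst(text)                     # 1. end-to-end RST parsing
    for edu in tree.edus:                      # 2. BERT prediction per discourse unit
        edu.scores = bert_predict(edu.text)
    if len(tree.edus) == 1:                    # an elementary structure keeps the BERT output
        return tree.edus[0].scores
    return rst_lstm_predict(tree)              # 3. refine with the trained RST-LSTM
```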

RuARG-2022

This repository applies the method to the RuARG-2022 argument mining shared task.

Requirements

Code

  • *.ipynb - Notebooks with data analysis and the training and evaluation scripts.
  • models_scripts/ - AllenNLP scripts for the BERT-based and RST-LSTM-based classifiers.
    • Both classifiers predict two labels (Stance and Premise) jointly.
    • RST-LSTM supports both Child-Sum and Binary variants of the Tree-LSTM; no significant difference was found between them on this task, and Binary is used by default. A sketch of the Child-Sum variant is given below.
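
For reference, here is a compact sketch of a Child-Sum Tree-LSTM cell (Tai et al., 2015) together with a joint two-headed output. It illustrates the general idea rather than reproducing the AllenNLP implementation in models_scripts/; all class and parameter names are illustrative.

```python
import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    """Child-Sum Tree-LSTM cell (Tai et al., 2015): a simplified sketch."""

    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.iou = nn.Linear(input_dim + hidden_dim, 3 * hidden_dim)  # input/output/update gates
        self.f_x = nn.Linear(input_dim, hidden_dim)                   # forget gate, node-input part
        self.f_h = nn.Linear(hidden_dim, hidden_dim)                  # forget gate, per-child part

    def forward(self, x, child_h, child_c):
        # x: (input_dim,); child_h, child_c: (num_children, hidden_dim)
        h_sum = child_h.sum(dim=0)
        i, o, u = torch.chunk(self.iou(torch.cat([x, h_sum])), 3)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        f = torch.sigmoid(self.f_x(x) + self.f_h(child_h))  # one forget gate per child
        c = i * u + (f * child_c).sum(dim=0)
        h = o * torch.tanh(c)
        return h, c

class JointHead(nn.Module):
    """Two linear heads over the root state for joint Stance and Premise prediction."""

    def __init__(self, hidden_dim, n_stance, n_premise):
        super().__init__()
        self.stance = nn.Linear(hidden_dim, n_stance)
        self.premise = nn.Linear(hidden_dim, n_premise)

    def forward(self, root_h):
        return self.stance(root_h), self.premise(root_h)
```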

Reference

Further information and examples can be found in our paper:

@INPROCEEDINGS{chistova2022dialogue,
      author = {Chistova, E. and Smirnov, I.},
      title = {Discourse-aware text classification for argument mining},
      booktitle = {Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference "Dialogue" (2022)},
      year = {2022},
      number = {21},
      pages = {93--105}
}
