
Evidence graphs for parsing argumentation structure


About

This repository holds the code of the Evidence Graph model, a model for parsing the argumentation structure of text.

It is essentially a re-implementation of the model first presented in (1). Most of the work was done in 2016-2017. It was used in the experiments of (2), (3), and (4).

Prerequisites

This code runs on Python 3.8. It is recommended to install it in a separate virtual environment. Here are installation instructions for an Ubuntu 18.04 Linux system:

# basics
sudo apt install python3.8-dev
# for lxml
sudo apt install libxml2-dev libxslt1-dev
# for matplotlib
sudo apt install libpng-dev libfreetype6-dev
# for graph plotting
sudo apt install graphviz

Setup environment

Install all required python libraries in the environment and download the language models required by the spacy library.

make install-requirements
make download-spacy-data-de
make download-spacy-data-en

Furthermore, several microtext corpora required for the experiments can be downloaded with:

make download-corpora

Test

Make sure all the tests pass.

make test

Run a minimal experiment

Run a (shortened and simplified) minimal experiment to check that everything is working:

env/bin/python src/experiments/run_minimal.py --corpus m112en

You should see, in the last lines of the output, an average macro F1 of the base classifiers similar to:
(cc ~= 0.82, ro ~= 0.75, fu ~= 0.74, at ~= 0.72),
where cc, ro, fu and at denote the central claim, role, function and attachment levels.

Evaluate the results, which have been written to data/:

env/bin/python src/experiments/eval_minimal.py --corpus m112en

You should see, in the first lines of the output, an average macro F1 for the decoded results similar to:
(cc ~= 0.86, ro ~= 0.74, fu ~= 0.76, at ~= 0.71).

Replicate published results

Adjust run_minimal.py:

  • Remove the line folds = folds[:5] in order to run all 50 train/test splits.
  • In the experimental conditions, set optimize to True so that the local model's hyperparameters are optimized.
  • In the experimental conditions, set optimize_weights to True so that the global model's hyperparameters are optimized.

For more details, see the actual experiment definitions in src/experiments.
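
The following sketch illustrates these adjustments. It is not the literal contents of run_minimal.py; only the names folds, optimize and optimize_weights are taken from the description above.

# Sketch of the adjustments described above, assuming run_minimal.py
# defines a list of folds and a dict of experimental conditions.

# 1. Run all 50 train/test splits:
# folds = folds[:5]   # <-- remove this shortcut

# 2. + 3. Enable hyperparameter optimization in each experimental condition:
condition = {
    "optimize": True,          # optimize the local model's hyperparameters
    "optimize_weights": True,  # optimize the global model's hyperparameters
}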

Note that the results published in the papers were obtained using the Python 2 version of this code base. With the migration to Python 3 and various updated dependencies, the scores differ slightly. To reproduce the exact published scores, you will need to run version v0.4.0 of this code base.

Reusing / extending components of the library

Use the same features for a new language

Load a spacy nlp object for the desired language and pass it, together with a connective lexicon, to the TextFeatures.

import spacy

from evidencegraph.features_text import TextFeatures
from evidencegraph.classifiers import EvidenceGraphClassifier

my_features = TextFeatures(
    nlp=spacy.load("klingon"),  # placeholder: load the spacy model for your language
    connectives={},  # add a connective lexicon for that language here
    feature_set=TextFeatures.F_SET_ALL_BUT_VECTORS
)
clf = EvidenceGraphClassifier(
    my_features.feature_function_segments,
    my_features.feature_function_segmentpairs
)

Use a custom base classifier

Derive a custom base classifier class (stick to the interface) and pass this class to the EvidenceGraphClassifier.

from evidencegraph.classifiers import BaseClassifier

class MyBaseClassifier(BaseClassifier):
    # do something different here
    pass

clf = EvidenceGraphClassifier(
    my_features.feature_function_segments,
    my_features.feature_function_segmentpairs,
    base_classifier_class=MyBaseClassifier
)

Load a custom corpus

Simply load a folder containing argument graph XML files into a GraphCorpus.

from evidencegraph.corpus import GraphCorpus

corpus = GraphCorpus()
corpus.load("path/to/my/folder")
texts, trees = corpus.segments_trees()
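
As a quick sanity check, you can inspect what was loaded. This assumes segments_trees() returns parallel collections of segmented texts and argumentation trees, as the variable names above suggest.

# Assumes texts and trees are parallel collections of the same length,
# as the variable names above suggest.
assert len(texts) == len(trees)
print(len(texts), "texts with argumentation trees loaded")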

References

  1. Joint prediction in MST-style discourse parsing for argumentation mining
    Andreas Peldszus, Manfred Stede.
    In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), Lisbon, Portugal, September 2015.

  2. Automatic recognition of argumentation structure in short monological texts
    Andreas Peldszus.
    Ph.D. thesis, Universität Potsdam, 2018.

  3. Comparing decoding mechanisms for parsing argumentative structures
    Stergos Afantenos, Andreas Peldszus, Manfred Stede.
    In: Argument & Computation, Volume 9, Issue 3, 2018, Pages 177-192.

  4. More or less controlled elicitation of argumentative text: Enlarging a microtext corpus via crowdsourcing
    Maria Skeppstedt, Andreas Peldszus, Manfred Stede.
    In: Proceedings of the 5th Workshop on Argument Mining, EMNLP 2018, Brussels, Belgium, November 2018.