Exploring Neural Text Simplification--PyTorch Version

This is a reimplementation of the NeuralTextSimplification repository in PyTorch. The original repository is based on Lua Torch, which cannot be installed on some machines (including mine), so I provide this PyTorch version in case it is useful to others.

The algorithm behind this code is from this paper: Nisioi, Sergiu, et al. "Exploring neural text simplification models." Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2017.

It is based on the standard LSTM-based sequence-to-sequence translation model, and OpenNMT is used as the code base.

How to use

OpenNMT dependency: first install the OpenNMT tool:

pip install OpenNMT-py

Clone this repository:

git clone https://github.com/jind11/NeuralTextSimplification-Pytorch

Create a directory named "models", download the released pre-trained NTS models, and save them in it. To train your own model on your own data, use this command (remember to change the data directory and the model save path):

./train.sh

We also provide the EW-SEW dataset used to train the released pre-trained model in the "data" folder.

Run translate.sh to get the translation results for your data:

mkdir results
./translate.sh

Run the automatic evaluation metrics (the nltk package is needed for this step):

./evaluate.sh

Benchmark

Since this is a reimplementation of an existing repository, we compare its performance with the original as a quality check, using two automatic metrics: SARI and BLEU.

Approach	Repository	SARI	BLEU
NTS default (beam 5, hypothesis 1)	Original	30.65	84.51
NTS default (beam 5, hypothesis 1)	This one	29.90	93.67
NTS SARI (beam 5, hypothesis 2)	Original	37.25	80.69
NTS SARI (beam 5, hypothesis 2)	This one	38.63	87.19
NTS BLEU (beam 12, hypothesis 1)	Original	30.77	84.70
NTS BLEU (beam 12, hypothesis 1)	This one	29.78	93.71

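From this table, we see that this reimplementation is comparable to the original code on SARI (within about 1.4 points in every setting, and higher for the NTS SARI configuration) and scores higher on BLEU in all three settings (by 6.5 to 9.2 points).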
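
As a quick check of that comparison, the differences can be computed directly from the table. The snippet below is only a sketch: the scores are copied from the rows above, and the names `results` and `score_deltas` are illustrative, not part of this repository.

```python
# Scores copied from the benchmark table: (SARI, BLEU) per repository.
results = {
    "NTS default (beam 5, hypothesis 1)": {"original": (30.65, 84.51), "this": (29.90, 93.67)},
    "NTS SARI (beam 5, hypothesis 2)": {"original": (37.25, 80.69), "this": (38.63, 87.19)},
    "NTS BLEU (beam 12, hypothesis 1)": {"original": (30.77, 84.70), "this": (29.78, 93.71)},
}


def score_deltas(table):
    """Return {approach: (SARI delta, BLEU delta)}, this repository minus the original."""
    deltas = {}
    for approach, scores in table.items():
        sari_orig, bleu_orig = scores["original"]
        sari_new, bleu_new = scores["this"]
        # Round to two decimals to match the precision of the table.
        deltas[approach] = (round(sari_new - sari_orig, 2), round(bleu_new - bleu_orig, 2))
    return deltas


deltas = score_deltas(results)
for approach, (d_sari, d_bleu) in deltas.items():
    print(f"{approach}: SARI {d_sari:+.2f}, BLEU {d_bleu:+.2f}")
```

A positive delta means this reimplementation scored higher than the original in that setting.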

Finally, we report the automatic metric results for all four hypotheses with a beam size of 5:

Hypothesis Number	SARI	BLEU
1	29.90	93.67
2	38.63	87.19
3	38.65	84.67
4	37.92	84.19
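
Scoring a specific hypothesis number requires pulling that candidate out of the n-best output for every source sentence. Below is a minimal sketch of that step, assuming the prediction file lists the n-best hypotheses for each source sentence on consecutive lines, best first (to my understanding, this is the layout OpenNMT-py produces when several hypotheses per sentence are requested). The function name `select_hypothesis` and the toy data are hypothetical and are not taken from this repository's scripts.

```python
def select_hypothesis(lines, n_best, k):
    """Pick the k-th (1-indexed) hypothesis for every source sentence.

    `lines` is a flat list of predictions with `n_best` consecutive
    entries per source sentence, ordered best first (assumed layout).
    """
    if not 1 <= k <= n_best:
        raise ValueError("k must be between 1 and n_best")
    if len(lines) % n_best != 0:
        raise ValueError("number of lines is not a multiple of n_best")
    # Every n_best-th line, starting at offset k - 1, belongs to rank k.
    return [line.strip() for line in lines[k - 1::n_best]]


# Toy example: two source sentences with four hypotheses each.
predictions = [
    "a1", "a2", "a3", "a4",
    "b1", "b2", "b3", "b4",
]
second = select_hypothesis(predictions, n_best=4, k=2)
print(second)
```

The selected list has one line per source sentence, so it can be compared against the references in the same way as a single-best output.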
