StylePTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer

Paper

StylePTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer
Yiwei Lyu*, Paul Pu Liang*, Hai Pham*, Eduard Hovy, Barnabas Poczos, Ruslan Salakhutdinov, and Louis-Philippe Morency
NAACL 2021. (*equal contribution)

If you find this repository useful, please cite our paper:

@article{lyu2021styleptb,
  title={StylePTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer},
  author={Lyu, Yiwei and Liang, Paul Pu and Pham, Hai and Hovy, Eduard and P{\'o}czos, Barnab{\'a}s and Salakhutdinov, Ruslan and Morency, Louis-Philippe},
  journal={arXiv preprint arXiv:2104.05196},
  year={2021}
}

Installation

First check that the requirements are satisfied:
Python 3.6
torch 1.2.0
numpy 1.18.1
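
The requirements above can be compared against the local environment before cloning. The following is a minimal sketch, assuming the listed versions are the ones the code was tested with (the README does not say whether newer versions also work). The helper names `find_mismatches` and `installed_versions` are hypothetical and not part of this repository.

```python
import sys

# Versions copied from the requirements list above.
REQUIRED_PYTHON = (3, 6)
REQUIRED_PACKAGES = {"torch": "1.2.0", "numpy": "1.18.1"}


def find_mismatches(installed, python_version,
                    required=REQUIRED_PACKAGES,
                    required_python=REQUIRED_PYTHON):
    """Return human-readable differences from the listed requirements."""
    problems = []
    found_python = tuple(python_version[:2])
    if found_python != required_python:
        problems.append("python: found {}.{}, listed {}.{}".format(
            found_python[0], found_python[1],
            required_python[0], required_python[1]))
    for name, wanted in sorted(required.items()):
        found = installed.get(name)
        if found is None:
            problems.append("{}: not installed, listed {}".format(name, wanted))
        # Ignore local build suffixes such as "+cu92" when comparing.
        elif found.split("+")[0] != wanted:
            problems.append("{}: found {}, listed {}".format(name, found, wanted))
    return problems


def installed_versions(names):
    """Import each package if available and read its __version__ attribute."""
    versions = {}
    for name in names:
        try:
            module = __import__(name)
        except ImportError:
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

Calling `find_mismatches(installed_versions(REQUIRED_PACKAGES), sys.version_info)` returns an empty list when the environment matches the listed versions. A non-empty list is a hint to investigate, not necessarily an error.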

The next step is to clone the repository:

git clone https://github.com/lvyiwei1/StylePTB.git

StylePTB Dataset

To check out the data for a single style transfer, run single_transform_checkout.py with the 3-letter style code:

python single_transform_checkout.py [3-letter style code]

After the script finishes, the data is in a folder named after the 3-letter code.

3-letter style codes:

TFU == To Future

TPA == To Past

TPR == To Present

ATP == Active To Passive

PTA == Passive To Active

PFB == PP Front To Back

PBF == PP Back To Front

IAD == Information Addition

ARR == ADJ/ADV Removal

SBR == Substatement Removal

PPR == PP Removal

AEM == ADJ Emphasis

VEM == Verb Emphasis

NSR == Noun Synonym Replacement

ASR == Adjective Synonym Replacement

VSR == Verb Synonym Replacement

NAR == Noun Antonym Replacement

AAR == Adjective Antonym Replacement

VAR == Verb Antonym Replacement

LFS == Least Frequent Synonym Replacement

MFS == Most Frequent Synonym Replacement
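
To check out several styles in one go, the command shown earlier can be wrapped in a small script. This is a minimal sketch, assuming it is run from the repository root so that single_transform_checkout.py is in the working directory. The helper names `build_checkout_command` and `checkout_styles` are hypothetical and not part of this repository.

```python
import subprocess
import sys

# The 21 style codes copied from the list above.
STYLE_CODES = {
    "TFU", "TPA", "TPR", "ATP", "PTA", "PFB", "PBF",
    "IAD", "ARR", "SBR", "PPR", "AEM", "VEM",
    "NSR", "ASR", "VSR", "NAR", "AAR", "VAR", "LFS", "MFS",
}


def build_checkout_command(code):
    """Return the argv list that checks out one single-style dataset."""
    code = code.strip().upper()
    if code not in STYLE_CODES:
        raise ValueError("unknown style code: {!r}".format(code))
    # sys.executable reuses the interpreter that runs this script.
    return [sys.executable, "single_transform_checkout.py", code]


def checkout_styles(codes, repo_dir="."):
    """Run the checkout script once per style code, stopping on failure."""
    for code in codes:
        subprocess.run(build_checkout_command(code), cwd=repo_dir, check=True)
```

For example, `checkout_styles(["TFU", "ATP"])` runs the checkout script twice, once for each code. Validating the code first catches typos before the script is invoked.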

A few of the compositional datasets are provided in the "Compositional Datasets" folder. No checkout is needed for these.

Running the model

The scripts for the GPT baseline, GRU+attn, Retrieve-Edit, and StyleGPT are in the "Model Codes" folder. Note that the code for Retrieve-Edit is taken directly from the code provided by the authors of "A Retrieve-and-Edit Framework for Predicting Structured Outputs" (NeurIPS 2018).

Evaluation

We used Maluuba nlg-eval to evaluate model performance with automated metrics. See https://github.com/Maluuba/nlg-eval for instructions on evaluation.
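
As an illustration of this evaluation step, the sketch below writes model outputs and references to line-aligned text files and passes them to the `compute_metrics` function described in the nlg-eval README. It assumes nlg-eval is installed and set up as that repository describes. The helper names, file names, and default output directory are hypothetical, and this is not necessarily the exact procedure used for the paper.

```python
import os


def write_lines(path, sentences):
    """Write one sentence per line, collapsing any internal whitespace."""
    with open(path, "w", encoding="utf-8") as f:
        for sentence in sentences:
            f.write(" ".join(sentence.split()) + "\n")


def prepare_eval_files(hypotheses, references, out_dir):
    """Write line-aligned hypothesis and reference files; return their paths."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    os.makedirs(out_dir, exist_ok=True)
    hyp_path = os.path.join(out_dir, "hyp.txt")
    ref_path = os.path.join(out_dir, "ref.txt")
    write_lines(hyp_path, hypotheses)
    write_lines(ref_path, references)
    return hyp_path, ref_path


def evaluate(hypotheses, references, out_dir="eval_tmp"):
    """Compute automated metrics for model outputs against references."""
    # Imported lazily so the file helpers work without nlg-eval installed.
    from nlgeval import compute_metrics

    hyp_path, ref_path = prepare_eval_files(hypotheses, references, out_dir)
    return compute_metrics(hypothesis=hyp_path, references=[ref_path])
```

Line `i` of the hypothesis file is scored against line `i` of the reference file, so both lists must be in the same order.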

Other scripts

The scripts used to perform automated transfers with parse trees are in the "Automatic Transfer Scripts" folder, and the webpages and full results of the human-annotated transfers are in the "Amazon Mechanical Turk Webpages and Results" folder.
