yzhangcs/ctc-copy

Non-autoregressive Text Editing with Copy-aware Latent Alignments

1Soochow University, Suzhou, China
2Tencent AI Lab

Citation

If you are interested in our work, please cite:

@inproceedings{zhang-etal-2023-ctc,
  title     = {Non-autoregressive Text Editing with Copy-aware Latent Alignments},
  author    = {Zhang, Yu  and
               Zhang, Yue  and
               Cui, Leyang  and
               Fu, Guohong},
  booktitle = {Proceedings of EMNLP},
  year      = {2023},
  address   = {Singapore}
}

Setup

The following packages should be installed:

Clone this repo recursively:

git clone https://github.com/yzhangcs/ctc-copy.git --recursive

You can follow this repo to obtain the 3-stage train/dev/test data for training an English GEC model. The multilingual datasets are available here.

Before running, preprocess each sentence pair into the format SRC:\t[src]\nTGT:\t[tgt]\n, where src and tgt are the source and target sentences, respectively. Consecutive sentence pairs are separated by a blank line. See data/clang8.toy for examples.
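The format described above can be sketched in a few lines of Python. This is a minimal illustration, not part of this repo: the helper names format_pairs and parse_pairs are hypothetical, and the example sentences are made up rather than taken from data/clang8.toy. Whether your sentences need tokenizing first depends on the dataset you follow.

```python
from typing import Iterable, List, Tuple


def format_pairs(pairs: Iterable[Tuple[str, str]]) -> str:
    """Render (src, tgt) pairs as SRC:/TGT: blocks separated by blank lines.

    Hypothetical helper for illustration; not provided by this repo.
    """
    blocks = [f"SRC:\t{src}\nTGT:\t{tgt}\n" for src, tgt in pairs]
    # Each block already ends with a newline, so joining with "\n"
    # leaves exactly one blank line between consecutive pairs.
    return "\n".join(blocks)


def parse_pairs(text: str) -> List[Tuple[str, str]]:
    """Read SRC:/TGT: blocks back into (src, tgt) pairs."""
    pairs: List[Tuple[str, str]] = []
    src = None
    for line in text.splitlines():
        if line.startswith("SRC:\t"):
            src = line[len("SRC:\t"):]
        elif line.startswith("TGT:\t") and src is not None:
            pairs.append((src, line[len("TGT:\t"):]))
            src = None
    return pairs


if __name__ == "__main__":
    # Made-up example sentences, assumed for illustration only.
    examples = [
        ("She go to school .", "She goes to school ."),
        ("I has a pen .", "I have a pen ."),
    ]
    print(format_pairs(examples), end="")
```

Writing the result of format_pairs to a file gives input in the layout described above; parse_pairs is a quick way to check that a preprocessed file round-trips before starting a training run.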

Run

Run the following command to train a 3-stage English model:

bash train.sh

To make predictions and evaluate them:

bash pred.sh

Contact

If you have any questions, please feel free to email me.

About

[EMNLP'23] Code for "Non-autoregressive Text Editing with Copy-aware Latent Alignments".
