# NLP - Text Generation

| Paper | Conference | Remarks |
| --- | --- | --- |
| SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient | AAAI 2017 | 1. A major difficulty in generating text with GANs is that the discrete outputs from the generative model make it difficult to pass the gradient update from the discriminative model to the generative model. 2. Propose a sequence generation framework, called SeqGAN, that models the data generator as a stochastic policy in reinforcement learning (RL); SeqGAN bypasses the generator differentiation problem by directly performing policy gradient updates (see the policy-gradient sketch below the table). |
| Generating Sentences from a Continuous Space | CoNLL 2016 | 1. Introduce and study an RNN-based variational autoencoder generative model that incorporates distributed latent representations of entire sentences. 2. Present techniques such as KL-weight annealing and word dropout for solving the difficult learning problem posed by this model (see the sentence-VAE sketch below the table). |
| Toward Controlled Generation of Text | ICML 2017 | 1. Aims at generating plausible natural language sentences whose attributes are dynamically controlled by learning disentangled latent representations with designated semantics. 2. Propose a new neural generative model which combines variational auto-encoders and holistic attribute discriminators for effective imposition of semantic structures. 3. The proposed model learns highly interpretable representations from even only word annotations, and produces realistic sentences with desired attributes. |
| An Actor-Critic Algorithm for Sequence Prediction | ICLR 2017 | 1. Present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL). 2. Address the train/test discrepancy problem by introducing a critic network that is trained to predict the value of an output token, given the policy of an actor network. |
| Adversarial Generation of Natural Language | ACL Workshop 2017 | 1. Introduce a simple baseline that addresses the discrete output space problem in language generation with GANs without relying on gradient estimators. |
| Generating Wikipedia by Summarizing Long Sequences | ICLR 2018 | 1. Show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents. 2. Use extractive summarization to coarsely identify salient information and a neural abstractive model to generate the article. 3. Introduce a decoder-only architecture that can scalably attend to very long sequences, much longer than typical encoder-decoder architectures used in sequence transduction. |
| A Tutorial on Deep Latent Variable Models of Natural Language | arXiv 2018 | Explores issues such as the fact that deep parameterizations of conditional likelihoods usually make posterior inference intractable, and that latent variable objectives often complicate backpropagation by introducing points of non-differentiability. |
| Text Generation from Knowledge Graphs with Graph Transformers | NAACL 2019 | 1. Provide an end-to-end trainable system for graph-to-text generation. 2. Introduce a novel graph transforming encoder which can leverage the relational structure of such knowledge graphs without imposing linearization or hierarchical constraints. |
| Optimus - Organizing Sentences via Pre-trained Modeling of a Latent Space | EMNLP 2020 | 1. Propose the first large-scale language VAE model, Optimus, where a universal latent embedding space for sentences is first pre-trained on a large text corpus and then fine-tuned for various language generation and understanding tasks. 2. Optimus achieves new state-of-the-art results on VAE language modeling benchmarks. |
| Leveraging Pre-trained Checkpoints for Sequence Generation Tasks | TACL 2020 | 1. Demonstrate the efficacy of pre-trained checkpoints for sequence generation. 2. Develop a Transformer-based sequence-to-sequence model that is compatible with publicly available pre-trained BERT, GPT-2 and RoBERTa checkpoints, and conduct an extensive empirical study on the utility of initializing the model, both encoder and decoder, with these checkpoints. |
| Transformer-based Conditional Variational Autoencoder for Controllable Story Generation | arXiv 2020 | 1. Investigate large-scale latent variable models (LVMs) for neural story generation -- an under-explored application for open-domain long text -- with objectives in two threads: generation effectiveness and controllability. 2. Advocate reviving latent variable modeling, essentially the power of representation learning, in the era of Transformers to enhance controllability without hurting state-of-the-art generation effectiveness. 3. Integrate latent representation vectors with a Transformer-based pre-trained architecture to build a conditional variational autoencoder (CVAE). |
| Trading Off Diversity and Quality in Natural Language Generation | arXiv 2020 | 1. Cast decoding as a multi-objective optimization problem aiming to simultaneously maximize both response quality and diversity. 2. Conduct the first large-scale evaluation of decoding methods along the entire quality-diversity spectrum. 3. Find that when diversity is a priority, all methods perform similarly, but when quality is viewed as more important, nucleus sampling outperforms all other evaluated decoding algorithms (see the nucleus-sampling sketch below the table). |
| GLGE - A New General Language Generation Evaluation Benchmark | arXiv 2020 | 1. Present the General Language Generation Evaluation (GLGE), a new multi-task benchmark for evaluating the generalization capabilities of NLG models across eight language generation tasks. |
| F^2-Softmax - Diversifying Neural Text Generation via Frequency Factorized Softmax | EMNLP 2020 | 1. Argue that sub-optimal text generation is mainly attributable to the imbalanced token distribution, which particularly misdirects the learning model when trained with the maximum-likelihood objective. 2. Propose two novel methods, F^2-Softmax and MefMax, for balanced training even with a skewed frequency distribution (see the factorized-softmax sketch below the table). 3. Demonstrate improvements in not only the diversity but also the quality of generated texts. |
| BLEURT - Learning Robust Metrics for Text Generation | ACL 2020 | 1. Propose BLEURT, a learned evaluation metric based on BERT that can model human judgments with a few thousand possibly biased training examples. 2. Demonstrate state-of-the-art results on the last three years of the WMT Metrics shared task and the WebNLG Competition dataset. |
| Best Practices for Data-Efficient Modeling in NLG - How to Train Production-Ready Neural Models with Less Data | COLING 2020 | 1. Describe a family of sampling and modeling techniques for attaining production quality with lightweight neural network models using only a fraction of the data that would otherwise be necessary, and present a thorough comparison between them. 2. Show that domain complexity dictates the appropriate approach for achieving high data efficiency. |
| A Distributional Approach to Controlled Text Generation | ICLR 2021 | 1. Propose a distributional approach to controlled text generation from pre-trained language models (LMs), which permits defining, in a single formal framework, “pointwise” and “distributional” constraints over the target LM. 2. Demonstrate a controlled LM balancing constraint satisfaction with divergence from the initial LM (GPT-2). |
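
The SeqGAN row above hinges on treating the generator as a stochastic policy so that discriminator scores can be used as rewards instead of back-propagated gradients. The snippet below is a minimal, hypothetical PyTorch sketch of that REINFORCE-style update for a toy GRU generator; the model sizes, the `discriminator_reward` stub, and all hyperparameters are illustrative assumptions, not the paper's actual configuration (the paper additionally uses Monte Carlo rollouts to assign rewards to intermediate tokens).

```python
import torch
import torch.nn as nn

VOCAB, EMB, HID, MAX_LEN, BOS = 1000, 32, 64, 20, 0  # toy sizes (assumptions)

class ToyGenerator(nn.Module):
    """Autoregressive GRU language model used as a stochastic policy."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRUCell(EMB, HID)
        self.out = nn.Linear(HID, VOCAB)

    def sample(self, batch_size):
        """Sample token sequences and keep per-step log-probabilities."""
        h = torch.zeros(batch_size, HID)
        tok = torch.full((batch_size,), BOS, dtype=torch.long)
        tokens, log_probs = [], []
        for _ in range(MAX_LEN):
            h = self.rnn(self.emb(tok), h)
            dist = torch.distributions.Categorical(logits=self.out(h))
            tok = dist.sample()                   # discrete, non-differentiable step
            tokens.append(tok)
            log_probs.append(dist.log_prob(tok))  # differentiable log pi(a_t | s_t)
        return torch.stack(tokens, 1), torch.stack(log_probs, 1)

def discriminator_reward(seqs):
    """Stand-in for a trained discriminator D(x); returns one score per sequence."""
    return torch.rand(seqs.size(0))  # placeholder reward in [0, 1]

gen = ToyGenerator()
opt = torch.optim.Adam(gen.parameters(), lr=1e-3)

seqs, log_probs = gen.sample(batch_size=8)
reward = discriminator_reward(seqs)              # no gradient flows through D
baseline = reward.mean()                         # simple variance-reduction baseline
loss = -((reward - baseline).unsqueeze(1) * log_probs).mean()  # REINFORCE objective
opt.zero_grad()
loss.backward()  # gradients reach the generator despite the discrete sampling step
opt.step()
```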
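
The sentence-VAE row mentions KL-weight annealing and word dropout as the tricks that keep the model from collapsing into a plain language model. The helpers below are a small, assumed illustration of those two tricks in isolation; the schedule shape, the `UNK_ID`, and the drop rate are made-up parameters, and the full model would plug them into an RNN encoder-decoder with a latent code z.

```python
import math
import random

UNK_ID = 3  # hypothetical id of the <unk> token in the vocabulary

def kl_weight(step, total_anneal_steps=10_000):
    """Sigmoid KL annealing: the KL term's weight rises from ~0 to ~1 over training,
    so the decoder first learns to reconstruct, then is pushed toward the prior."""
    return 1.0 / (1.0 + math.exp(-10.0 * (step / total_anneal_steps - 0.5)))

def word_dropout(token_ids, drop_rate=0.3):
    """Randomly replace decoder input tokens with <unk>, forcing the decoder to rely
    on the latent code rather than purely on its autoregressive context."""
    return [UNK_ID if random.random() < drop_rate else t for t in token_ids]

# Per-step loss combination (reconstruction and KL terms assumed computed elsewhere):
#   loss = reconstruction_nll + kl_weight(step) * kl_divergence
print(kl_weight(0), kl_weight(5_000), kl_weight(10_000))
print(word_dropout([12, 7, 95, 4, 8]))
```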
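
The decoding-tradeoff row singles out nucleus (top-p) sampling as the strongest choice when quality matters. Here is a bare-bones NumPy version of nucleus sampling; the toy distribution and the p value are arbitrary assumptions, and a real decoder would apply this to the model's next-token probabilities at every generation step.

```python
import numpy as np

def nucleus_sample(probs, p=0.9, rng=None):
    """Sample a token id from the smallest set of tokens whose cumulative
    probability exceeds p (the 'nucleus'), renormalized over that set."""
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]                           # token ids, most probable first
    sorted_probs = probs[order]
    cutoff = np.searchsorted(np.cumsum(sorted_probs), p) + 1  # size of the nucleus
    nucleus_ids = order[:cutoff]
    nucleus_probs = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()
    return rng.choice(nucleus_ids, p=nucleus_probs)

# Toy next-token distribution over a 5-word vocabulary (made-up numbers).
probs = np.array([0.45, 0.30, 0.15, 0.07, 0.03])
print(nucleus_sample(probs, p=0.9))  # almost always one of the top three tokens
```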
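
The F^2-Softmax row describes factorizing next-token prediction into a frequency-class prediction followed by a within-class token prediction, so that rare tokens are not drowned out by frequent ones under maximum-likelihood training. The module below is a rough, assumed sketch of that two-stage factorization: the equal-mass class split is only a stand-in for the paper's MefMax procedure, and all names and dimensions are toy values rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def equal_mass_classes(token_freqs, n_classes):
    """Greedy stand-in for MefMax: order tokens by frequency and cut the vocabulary
    into contiguous groups with roughly equal total frequency mass."""
    order = torch.argsort(token_freqs, descending=True)
    cum = torch.cumsum(token_freqs[order], dim=0) / token_freqs.sum()
    class_of = torch.empty_like(order)
    class_of[order] = torch.clamp((cum * n_classes).long(), max=n_classes - 1)
    return class_of  # class id for every token in the vocabulary

class FactorizedSoftmax(nn.Module):
    """P(token | h) = P(class | h) * P(token | class, h)."""
    def __init__(self, hidden, vocab, class_of, n_classes):
        super().__init__()
        self.class_of = class_of
        self.cls_head = nn.Linear(hidden, n_classes)
        self.tok_head = nn.Linear(hidden, vocab)

    def log_prob(self, h, target):
        cls = self.class_of[target]
        log_p_cls = F.log_softmax(self.cls_head(h), dim=-1).gather(1, cls[:, None])
        tok_logits = self.tok_head(h).masked_fill(
            self.class_of[None, :] != cls[:, None], float("-inf"))  # keep only the target's class
        log_p_tok = F.log_softmax(tok_logits, dim=-1).gather(1, target[:, None])
        return (log_p_cls + log_p_tok).squeeze(1)

# Toy usage with made-up token frequencies and hidden states.
vocab, hidden, n_classes = 50, 16, 5
freqs = torch.rand(vocab) ** 3 + 1e-3  # skewed fake frequency counts
fsm = FactorizedSoftmax(hidden, vocab, equal_mass_classes(freqs, n_classes), n_classes)
h = torch.randn(4, hidden)
target = torch.randint(0, vocab, (4,))
print(fsm.log_prob(h, target))         # per-example log-likelihoods
```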

Back to index