Techniques for KL Vanishing Problem Revisited

Background

The variational auto-encoder (VAE) has attracted much attention from the NLP community and has already achieved promising results on various NLP tasks. However, a challenge commonly reported in many works is the KL vanishing problem (sometimes also called posterior collapse). Ideally, the VAE learns a good latent distribution and generates well-formed text conditioned on samples from it. In practice, however, the VAE often ignores the latent representation entirely when decoding and fails to learn the latent distribution. Since the latent representation then carries no useful information, the whole VAE degenerates into a standard language model. Here we review related work on handling KL vanishing.
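For context, the quantity that "vanishes" is the KL term in the (negative) ELBO training objective. Below is a minimal NumPy sketch of that objective for a diagonal-Gaussian posterior and a standard-normal prior; the shapes, constants, and function names are illustrative assumptions, not code from any of the papers listed here.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims.

    When this term is driven to (near) zero for every input, the posterior
    matches the prior, the latent code carries no information, and the
    decoder degenerates into a plain language model, i.e. KL vanishing.
    """
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=-1)

def neg_elbo(recon_nll, mu, logvar, kl_weight=1.0):
    """Negative ELBO = reconstruction NLL + (optionally weighted) KL term.

    kl_weight is the knob that the annealing schedules below vary from 0 to 1
    over the course of training.
    """
    return recon_nll + kl_weight * gaussian_kl(mu, logvar)

# Toy example: a batch of 4 sentences with a 16-dimensional latent code.
mu = np.random.randn(4, 16) * 0.1
logvar = np.random.randn(4, 16) * 0.1
recon_nll = np.random.rand(4) * 50.0  # stand-in for per-sentence decoder NLL
print(neg_elbo(recon_nll, mu, logvar, kl_weight=0.5))
```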

Solutions for KL Vanishing

  1. KL Annealing:
     Generating Sentences from a Continuous Space (CoNLL2016).
     • Linear Annealing Schedule
     • Sigmoid Annealing Schedule
     (a minimal sketch of the linear, sigmoid, and cyclical schedules appears after this list)
  2. Word Dropout:
     Generating Sentences from a Continuous Space (CoNLL2016).
  3. Bag-of-Words Loss:
     Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders (ACL2017).
  4. Inverse Autoregressive Flow:
     Improved Variational Inference with Inverse Autoregressive Flow (NIPS2016).
  5. Collaborative Variational Encoder-Decoder:
     Improving Variational Encoder-Decoders in Dialogue Generation (AAAI2018).
  6. Cyclical Annealing:
     Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing (NAACL2019).
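To make items 1 and 6 concrete, here is a small sketch of how the KL weight can be scheduled over training steps. The specific constants (warmup length, sigmoid slope, number of cycles, ramp ratio) are illustrative assumptions; the papers above describe the exact settings they use.

```python
import numpy as np

def linear_anneal(step, warmup_steps):
    """Linearly ramp the KL weight from 0 to 1 over warmup_steps."""
    return min(1.0, step / warmup_steps)

def sigmoid_anneal(step, midpoint, slope=0.002):
    """Sigmoid ramp centered at `midpoint`; `slope` controls its sharpness."""
    return 1.0 / (1.0 + np.exp(-slope * (step - midpoint)))

def cyclical_anneal(step, total_steps, n_cycles=4, ratio=0.5):
    """Cyclical schedule: within each cycle the weight ramps linearly to 1
    for the first `ratio` fraction of the cycle, then stays at 1."""
    period = total_steps / n_cycles
    pos = (step % period) / period
    return min(1.0, pos / ratio)

# Example: inspect the KL weight at a few points during training.
for step in (0, 1000, 5000, 10000):
    print(step,
          round(linear_anneal(step, 10000), 3),
          round(sigmoid_anneal(step, 5000), 3),
          round(cyclical_anneal(step, 20000, n_cycles=4), 3))
```

In all three cases the schedule only changes the weight on the KL term of the objective; restarting it periodically (item 6) repeatedly gives the model a phase in which the latent code is cheap to use before the full KL penalty returns.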

[To Be Continued]
