Skip to content

thaihungle/VMED

Repository files navigation

VMED

source code for Variational Memory Encoder-Decoder

arXiv version: https://arxiv.org/abs/1807.09950
NIPS version: https://nips.cc/Conferences/2018/Schedule?showEvent=11166
repo reference: https://github.com/Mostafa-Samir/DNC-tensorflow https://github.com/Conchylicultor/DeepQA

Please prepare your conversation data as follows:

  • A pickle file contains 3 objects: str2tok, tok2str, dialogs
  • str2tok and tok2str are dicitonaries mapping from word to index and index to word, respectively
  • index 0,1,2 should be spared for special words: pad, go, eos
  • dialogs is a list of all conversation pairs. Each of its elements is another list of two lists, corresponding to the input sequence and output sequence. The sequences contain index of the word in the vocab. Special words will be added later (e.g, [[269, 230, 54, 94, 532, 23], [90, 64, 269, 125, 35, 94, 532, 9, 61, 1529]])
  • To simulate conversations with multiple pairs, just concatenate all sequences until the response moment as the input sequence, and the ground truth response as the output sequence
  • Please refer https://github.com/Conchylicultor/DeepQA for data preprocessing details

To run the code:

  • train VMED example: python qa_task.py --mode=train --num_mog_mode=3 --mem_size=15 --data_dir='path_to_pickle'
  • test VMED example: python qa_task.py --mode=test --num_mog_mode=3 --mem_size=15 --data_dir='path_to_pickle'
  • VLSTM example: python qa_task.py --mode=train --num_mog_mode=1 --use_mem=False --data_dir='path_to_pickle'
  • CVAE example: python qa_task.py --mode=train --num_mog_mode=1 --use_mem=False --single_KL=True --data_dir='path_to_pickle'

Run with word embedding:

  • set --use_pretrain_emb value (word2vec or glove)
  • hard code to modify path to embedding files

Feel free to modify the hyper-parameters (some are currently hard coded), add beam search and other advanced features

About

Variational Memory Encoder-Decoder

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages