
RNN Seq2Seq Based Abstract Summarization(ABS) On Tensorflow


thinkwee/Abstract_Summarization_RNN


Note: this code is no longer actively maintained.

introduction

  • attention-based summarization in TensorFlow using a seq2seq model
  • code for my graduation project
  • the data is not provided for the time being

environment

  • ubuntu 16.04 lts
  • anaconda python 3.6
  • recompiled tensorflow r1.7 gpu version
  • CUDA 9.0
  • cudnn 7.1.2
  • rouge

run

  • This work uses the Gigaword dataset, which is not publicly available. You need to obtain the data yourself.
  • The SentiWordNet 3.0 dataset can be found here: SentiWordNet3.0
  • The code was written for an early version of TensorFlow. I do not recommend running it directly; it is for reference only.
  • Run python main.py -help for help.
  • Run python main.py -w2v to train word vectors on the Gigaword dataset using Word2Vec, then run python main.py -train to train the model and python main.py -test to test the model (this only produces the output for the test set).
  • You need to install ROUGE to evaluate the output. All results were collected with the original Perl version of ROUGE. Using PyRouge may make the results slightly higher.
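
For readers unfamiliar with ROUGE, the sketch below shows roughly what ROUGE-1 measures: clipped unigram overlap between a generated summary and a reference. It is only an illustration, not the scoring used in this repository. The function name `rouge_1` and the whitespace tokenization are assumptions, and it omits the stemming, stopword handling, and confidence intervals of the original Perl ROUGE, so its numbers will not match the reported results.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Simplified ROUGE-1 on lowercased, whitespace-split tokens.

    Hypothetical helper for illustration only; not the official ROUGE script.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    if overlap:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the cat sat", "the cat sat on the mat")
    print(scores)
```

For the reported numbers, use the original Perl ROUGE as described above.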

progress

  • finish word embedding matrix
  • build seq2seq model
  • test lstm and gru core
  • test bidirectional core
  • fix infer problem
  • test multilayer with dropout core
  • fix lazy loading
  • fix pre-processing
  • try training with non-mentor model
  • secondary activation
  • test attention decoder(luong attention)
  • choose last batch in each epoch as the validation set
  • learning rate decay: gradient descent, low initial value, decay=0.995
  • cut vocab size to 3000, replace rare words with unk
  • enlarge rnn hidden units size
  • fix word embedding matrix and try to load model
  • divide infer and train into two graphs
  • use ROUGE to evaluate the model
  • save each test result
  • fix unk problems
  • train sentiment classification svm
  • add sentiment-blended word embeddings
  • test sentiment classify
  • use larger corpus
  • collect ROUGE
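
Two of the steps above, capping the vocabulary at 3000 words with unk replacement and decaying the learning rate by 0.995, can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the repository's code: the helper names, the `<unk>` token spelling, and applying the decay once per step are all hypothetical choices made for illustration.

```python
from collections import Counter

UNK = "<unk>"  # assumed spelling of the unknown-word token


def build_vocab(sentences, max_size=3000):
    """Map the most frequent tokens to ids, reserving id 0 for UNK."""
    counts = Counter(tok for sent in sentences for tok in sent)
    vocab = {UNK: 0}
    # Keep max_size - 1 real tokens so the total size, UNK included, is at most max_size.
    for tok, _ in counts.most_common(max_size - 1):
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def replace_rare(sentence, vocab):
    """Replace every out-of-vocabulary token with UNK."""
    return [tok if tok in vocab else UNK for tok in sentence]


def decayed_lr(initial_lr, step, decay=0.995):
    """Exponential decay: initial_lr * decay ** step (decay interval is an assumption)."""
    return initial_lr * decay ** step


if __name__ == "__main__":
    data = [["a", "b", "a"], ["a", "c", "b"]]
    vocab = build_vocab(data, max_size=3)
    print(vocab)
    print(replace_rare(["a", "c", "d"], vocab))
    print(decayed_lr(0.1, 100))
```

A small vocabulary keeps the output softmax cheap but pushes more words to unk, which is why the unk handling appears as a separate item in the list.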

current effect

  • ROUGE files are collected in './ROUGE_ANSWER'
