On the challenges of learning with inference networks on sparse, high-dimensional data

Overview

This repository contains code to learn DLGMs/VAEs on sparse, non-negative data (though it can be adapted to other data types) while optimizing the variational parameters predicted by the inference network during learning. It implements the models and techniques detailed in the paper:

On the challenges of learning with inference networks on sparse, high-dimensional data
Rahul G. Krishnan, Dawen Liang, Matthew Hoffman
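
As a rough illustration of what "optimizing the variational parameters predicted by the inference network" can mean, the sketch below starts from an encoder's prediction of (mu, logvar) and takes a few gradient-ascent steps on the ELBO for a single data point. It is not the code in this repository: it swaps in a toy linear-Gaussian decoder so that the ELBO and its gradients have a closed form in plain numpy, and every name in it (elbo, refine, W, b, A) is hypothetical.

```python
import numpy as np

def elbo(x, mu, logvar, W, b):
    """ELBO (up to an additive constant) for a toy model:
    z ~ N(0, I), x | z ~ N(W z + b, I), q(z) = N(mu, diag(exp(logvar)))."""
    var = np.exp(logvar)
    resid = x - W.dot(mu) - b
    col_sq = (W ** 2).sum(axis=0)          # squared norm of each column of W
    recon = -0.5 * (resid.dot(resid) + (var * col_sq).sum())
    kl = 0.5 * (mu ** 2 + var - logvar - 1.0).sum()
    return recon - kl

def refine(x, mu, logvar, W, b, n_steps=50, lr=0.05):
    """Gradient ascent on the ELBO w.r.t. the local parameters (mu, logvar)."""
    col_sq = (W ** 2).sum(axis=0)
    mu, logvar = mu.copy(), logvar.copy()
    for _ in range(n_steps):
        var = np.exp(logvar)
        grad_mu = W.T.dot(x - W.dot(mu) - b) - mu
        grad_logvar = -0.5 * var * (col_sq + 1.0) + 0.5
        mu += lr * grad_mu
        logvar += lr * grad_logvar
    return mu, logvar

rng = np.random.RandomState(0)
D, K = 20, 3                                # toy sizes (assumptions)
W, b = 0.3 * rng.randn(D, K), np.zeros(D)   # toy decoder parameters
A = 0.1 * rng.randn(K, D)                   # stand-in for an inference network
x = rng.poisson(0.3, size=D).astype(float)  # one sparse, non-negative data point

mu0, logvar0 = A.dot(x), np.zeros(K)        # encoder's initial prediction
mu1, logvar1 = refine(x, mu0, logvar0, W, b)

elbo_before = elbo(x, mu0, logvar0, W, b)
elbo_after = elbo(x, mu1, logvar1, W, b)
```

In this toy setting the refined parameters give a tighter bound than the encoder's initial guess (elbo_after exceeds elbo_before). A model could then be updated using the refined parameters instead of the raw prediction.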

Contact

For questions, email: Rahul G. Krishnan

Requirements

python2.7
numpy, scipy, nltk, gensim
theano
theanomodels

Setup

The repository is arranged as follows:

ipynb - Code to visualize plots/simulations/samples from the generative model
expt - Folders for experiments
optvaedatasets - Setup for datasets
optvaemodels - Code for the model
optvaeutils - Utility functions

Tutorial

To run the model on your own data, specify a dataset as a dictionary with the keys shown below. See newsgroups.py for an example of how to set up a dataset from scratch.

    dset = {}
    dset['vocabulary'] = vocab         # array of length V containing the vocabulary
    dset['train']      = train_matrix  # scipy sparse matrix of size Ntrain x dim_features
    dset['valid']      = valid_matrix  # scipy sparse matrix of size Nvalid x dim_features
    dset['test']       = test_matrix   # scipy sparse matrix of size Ntest x dim_features
    dset['dim_observations'] = dset['train'].shape[1]  # number of features (columns)
    dset['data_type']  = 'bow'         # bag-of-words count data
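
For a concrete picture of the dictionary above, here is a minimal sketch that builds it from a tiny tokenized corpus. The toy documents, the to_bow helper, the CSR format and the float32 dtype are assumptions made for illustration; newsgroups.py remains the reference for what the code in this repository expects.

```python
import numpy as np
from scipy import sparse

def to_bow(token_lists, word2idx):
    """Turn tokenized documents into a sparse document-by-vocabulary count matrix.
    Tokens that are not in the vocabulary are dropped."""
    rows, cols = [], []
    for r, tokens in enumerate(token_lists):
        for tok in tokens:
            if tok in word2idx:
                rows.append(r)
                cols.append(word2idx[tok])
    vals = np.ones(len(rows), dtype=np.float32)
    shape = (len(token_lists), len(word2idx))
    # Repeated (row, col) pairs are summed when converting COO to CSR,
    # which yields per-document word counts.
    return sparse.coo_matrix((vals, (rows, cols)), shape=shape).tocsr()

# Hypothetical toy corpus, already split and tokenized.
train_docs = [["sparse", "data", "sparse"], ["inference", "network"]]
valid_docs = [["sparse", "network"]]
test_docs = [["data", "unknown"]]

# The vocabulary is built from the training split only.
vocab_list = sorted(set(tok for doc in train_docs for tok in doc))
word2idx = dict((w, i) for i, w in enumerate(vocab_list))

dset = {}
dset['vocabulary'] = np.array(vocab_list)
dset['train'] = to_bow(train_docs, word2idx)
dset['valid'] = to_bow(valid_docs, word2idx)
dset['test'] = to_bow(test_docs, word2idx)
dset['dim_observations'] = dset['train'].shape[1]
dset['data_type'] = 'bow'
```

Each split shares the same column order, so column j always refers to dset['vocabulary'][j].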

See the IPython notebook TrainingVAEsparse.ipynb for more details.
