rahulk90/vae_sparse

On the challenges of learning with inference networks on sparse, high-dimensional data

Overview

This repository contains code to learn DLGMs/VAEs on sparse, non-negative data (it can be easily modified for other data types) while optimizing the variational parameters predicted by the inference network during learning. It implements the models and techniques detailed in the paper:

On the challenges of learning with inference networks on sparse, high-dimensional data
Rahul G. Krishnan, Dawen Liang, Matthew Hoffman

Contact

Requirements

Setup

The repository is arranged as follows:

Tutorial

To run the model on your own data, specify a dataset as a dictionary with the keys shown below. See newsgroups.py for an example of how to set up the dataset from scratch.

    dset = {}
    dset['vocabulary'] = vocab         # array of length V containing the vocabulary
    dset['train']      = train_matrix  # scipy sparse matrix of size Ntrain x dim_features
    dset['valid']      = valid_matrix  # scipy sparse matrix of size Nvalid x dim_features
    dset['test']       = test_matrix   # scipy sparse matrix of size Ntest x dim_features
    dset['dim_observations'] = dset['train'].shape[1]  # number of feature columns
    dset['data_type']  = 'bow'         # bag-of-words counts

See the IPython notebook TrainingVAEsparse.ipynb for more details.
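
As a rough illustration of the dictionary layout described in this tutorial, the sketch below builds a `dset` from a tiny in-memory corpus. The toy documents, the `to_bow` helper, the whitespace tokenization, the CSR format, and the `float32` dtype are all assumptions made for this example and are not taken from the repository; newsgroups.py remains the reference for how the authors prepare data.

```python
import numpy as np
from scipy import sparse

# Hypothetical toy corpus (assumption); replace with your own documents.
docs = [
    "sparse data is high dimensional",
    "inference networks amortize inference",
    "sparse counts are non negative",
    "high dimensional counts are sparse",
]

# Vocabulary of length V and a word -> column index lookup.
vocab = np.array(sorted({w for d in docs for w in d.split()}))
word_to_idx = {w: i for i, w in enumerate(vocab)}


def to_bow(doc_list):
    """Hypothetical helper: word-count matrix of shape (len(doc_list), V)."""
    rows, cols = [], []
    for r, doc in enumerate(doc_list):
        for w in doc.split():
            rows.append(r)
            cols.append(word_to_idx[w])
    data = np.ones(len(rows), dtype=np.float32)
    counts = sparse.coo_matrix(
        (data, (rows, cols)), shape=(len(doc_list), len(vocab))
    )
    # Converting COO to CSR sums repeated (row, col) entries into counts.
    return counts.tocsr()


dset = {
    'vocabulary': vocab,
    'train': to_bow(docs[:2]),
    'valid': to_bow(docs[2:3]),
    'test': to_bow(docs[3:]),
    'data_type': 'bow',
}
dset['dim_observations'] = dset['train'].shape[1]
```

Each split shares the same column ordering because all three matrices are built from one vocabulary; with real data, the vocabulary would typically be fixed from the training split before the validation and test matrices are constructed.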