jsmf-raw

Joint Stochastic Matrix Factorization (JSMF) for the Rectified Anchor Word (RAW) algorithm.

Co-occurrence information is a powerful statistic that can model various discrete objects by their joint occurrences with other objects. By transforming unsupervised problems of learning low-dimensional geometry into provable decompositions of co-occurrence information, spectral inference provides fast algorithms and optimality guarantees for non-linear dimensionality reduction and latent topic analysis. Spectral approaches reduce the dependence on the original training examples, thereby producing substantial gains in efficiency, but at a cost:

  • The algorithms perform poorly on real data, which does not necessarily follow the underlying model.
  • Users can no longer infer information about individual examples, which is often important for real-world applications.
  • Model complexity grows rapidly as the number of objects increases, requiring careful curation of the vocabulary.

The first issue is called model-data mismatch, a fundamental problem common to every spectral inference method for latent variable models. Because real data never exactly follows any particular computational model, this issue must be addressed for spectral inference to be practical beyond synthetic settings.

The rectification paradigm in this code handles model-data mismatch not by making the model more complex, but by transforming the data to a point in the space of ideal posteriors.
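To make the idea of "transforming the data" concrete, here is a minimal sketch in Python with NumPy. It is not the code in this repository. It assumes, as an illustration only, that rectification alternates projections onto three constraint sets that an ideal co-occurrence matrix would satisfy: positive semidefinite with rank at most `k`, entries summing to one, and non-negative entries. The function names, the toy matrices `B` and `A`, the noise level, and the iteration count are all hypothetical.

```python
import numpy as np


def project_psd_rank_k(C, k):
    """Keep the k largest eigenvalues of the symmetrized matrix, clipped at zero."""
    C = (C + C.T) / 2.0
    vals, vecs = np.linalg.eigh(C)            # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:k]          # indices of the k largest
    vals_k = np.clip(vals[top], 0.0, None)    # drop any negative eigenvalues
    return (vecs[:, top] * vals_k) @ vecs[:, top].T


def project_normalized(C):
    """Shift every entry by the same amount so that the entries sum to one."""
    n = C.shape[0]
    return C + (1.0 - C.sum()) / (n * n)


def project_nonnegative(C):
    """Set negative entries to zero."""
    return np.maximum(C, 0.0)


def rectify(C, k, iterations=50):
    """Alternate the three projections (an assumed scheme, see the text above)."""
    for _ in range(iterations):
        C = project_psd_rank_k(C, k)
        C = project_normalized(C)
        C = project_nonnegative(C)
    return C


# Hypothetical toy model: 4 objects, 2 latent topics.
B = np.array([[0.5, 0.0],
              [0.3, 0.1],
              [0.2, 0.3],
              [0.0, 0.6]])                     # each column sums to one
A = np.array([[0.4, 0.1],
              [0.1, 0.4]])                     # entries sum to one
C_true = B @ A @ B.T                           # ideal co-occurrence matrix

# Simulate model-data mismatch with small symmetric noise.
rng = np.random.default_rng(0)
noise = rng.normal(scale=0.01, size=C_true.shape)
C_noisy = C_true + (noise + noise.T) / 2.0

C_rect = rectify(C_noisy, k=2)
print("sum of entries:", C_rect.sum())
print("smallest entry:", C_rect.min())
```

The model itself stays a plain rank-`k` factorization; only the input matrix is moved toward the set of matrices the model can explain. Because the rank constraint is not convex, this simple loop carries no convergence guarantee, so treat the output as an approximation.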

Reference

[Main paper]

[Background paper]

[Related work]