forked from clips/topbox

Python 3 wrapper around the Stanford Topic Modeling Toolbox. Intended to be used for hassle-free supervised topic classification with Labeled Latent Dirichlet Allocation (L-LDA, LLDA, sLDA).

jonaschn/topbox

 
 

topbox

A small Python 3 wrapper around the Stanford Topic Modeling Toolbox (STMT) that makes working with L-LDA a bit easier; no need to leave the Python environment. More information on its workings can be found here.

Setting up

Docker Setup

On Linux, the setup looks something like this:

git clone https://github.com/jonaschn/topbox
cd topbox/box
wget http://nlp.stanford.edu/software/tmt/tmt-0.4/tmt-0.4.0.jar
cd ..
docker build -t jonaschn/topbox:latest .
docker run -v `pwd`:/opt/topbox -it jonaschn/topbox:latest /bin/bash

To check that everything works, run python test.py.

Manual Setup

You need an old Java SDK (version 6 or 7); with any other version, topbox will not work.
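Before attempting a manual setup, it can help to check which Java version is on the PATH. The following is a minimal sketch, not part of topbox: the helper names parse_java_major, installed_java_major and is_supported are hypothetical, and the parsing assumes the usual `java -version` output, where legacy releases are reported as "1.x" (for example "1.7.0_80" for Java 7).

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the major Java version from `java -version` output.

    Handles the legacy scheme ("1.7.0_80" -> 7) and the
    modern one ("11.0.2" -> 11).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a version string")
    first, second = match.group(1), match.group(2)
    # Legacy versions are written 1.x, where x is the major version.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major() -> int:
    """Run `java -version` and return the major version it reports."""
    # `java -version` prints its report to stderr, not stdout.
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    return parse_java_major(result.stderr)


def is_supported(major: int) -> bool:
    """True for the versions this README names (6 or 7)."""
    return major in (6, 7)
```

Calling is_supported(installed_java_major()) then tells you whether the installed Java matches the requirement above. If it does not, the Docker setup is probably the easier route.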
