Topic Annotator

This Scala library performs common preprocessing tasks on a corpus and then runs it through one of several topic models (implemented using Gibbs sampling) to produce annotated output. It is not yet ready for general use, but it should simplify the format wrangling needed to test different topic models on a corpus.

Check out the `org.chrisjr.topic_annotator.App` class or the various tests for sample usage.

Preprocessing options:

- regex tokenization
- lowercasing
- TF-IDF filtering
- stoplists
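
To give a rough idea of what these steps do, here is a minimal, self-contained sketch in plain Scala. It does not use this library's API: the object `PreprocessSketch`, its method names, the default token pattern, the `minScore` threshold, and the order in which the steps run are all assumptions made for illustration. See the `App` class and the tests for the real usage.

```scala
// Hypothetical illustration only; none of these names come from the library.
object PreprocessSketch {
  type Doc = Seq[String]

  // Regex tokenization: every match of the pattern becomes one token.
  // The default pattern (runs of Unicode letters) is an assumed choice.
  def tokenize(text: String, pattern: String = "\\p{L}+"): Doc =
    pattern.r.findAllIn(text).toVector

  // Lowercasing: normalize case so "The" and "the" count as the same token.
  def lowercase(doc: Doc): Doc = doc.map(_.toLowerCase)

  // Stoplist: drop every token that appears in the given set.
  def removeStopwords(doc: Doc, stoplist: Set[String]): Doc =
    doc.filterNot(stoplist.contains)

  // TF-IDF filtering: keep a token in a document only if
  // tf(token, doc) * ln(N / df(token)) is at least minScore.
  def tfidfFilter(docs: Seq[Doc], minScore: Double): Seq[Doc] = {
    val n = docs.size.toDouble
    // Document frequency: the number of documents containing each token.
    val df: Map[String, Int] =
      docs.flatMap(_.distinct).groupBy(identity).map { case (t, occ) => t -> occ.size }
    docs.map { doc =>
      // Term frequency: a token's share of this document's tokens.
      val tf = doc.groupBy(identity).map { case (t, occ) => t -> occ.size.toDouble / doc.size }
      doc.filter(t => tf(t) * math.log(n / df(t)) >= minScore)
    }
  }

  // Assumed order: tokenize, lowercase, apply the stoplist, then filter by TF-IDF.
  def preprocess(texts: Seq[String], stoplist: Set[String], minScore: Double): Seq[Doc] =
    tfidfFilter(texts.map(t => removeStopwords(lowercase(tokenize(t)), stoplist)), minScore)
}
```

For example, `PreprocessSketch.preprocess(Seq("The cat sat.", "The dog sat!", "The cat ran"), Set("the"), 0.3)` leaves only the tokens that are distinctive for a single document: the first document becomes empty, the second keeps `dog`, and the third keeps `ran`.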

Topic models:
