Created topic models with TF-IDF for 142 unlabeled news articles using the Stanford CoreNLP library. With the resulting data, implemented clustering and classification algorithms (K-means and KNN) from scratch.

jaewhyun/text_analytics

Text Analytics (Java)

Environment

$ mkdir bin
$ export CLASSPATH=bin/calctfidf/:bin/k_means/:bin/knn/:bin/TextAnalytics:bin:lib/JavaML/:lib/stanford-corenlp-full/:lib/Jama/*:.

TF-IDF / Topics per Folder

Run

$ javac -d bin -sourcepath src src/calctfidf/Main.java && java calctfidf.Main Data/DataSet
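
For readers unfamiliar with the weighting, here is a minimal sketch of a common TF-IDF formulation (term frequency times the natural log of inverse document frequency). It is not the repository's `calctfidf` code: the class name `TfIdfSketch`, its method names, and the toy documents are all hypothetical, and the repository may use a different variant (smoothing, lemmatised tokens from CoreNLP, and so on).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the repository's calctfidf package.
class TfIdfSketch {

    /** Term frequency: occurrences of the term divided by the document length. */
    public static double tf(List<String> doc, String term) {
        if (doc.isEmpty()) {
            return 0.0;
        }
        long count = doc.stream().filter(term::equals).count();
        return (double) count / doc.size();
    }

    /** Inverse document frequency: ln(N / number of documents containing the term). */
    public static double idf(List<List<String>> docs, String term) {
        long containing = docs.stream().filter(d -> d.contains(term)).count();
        if (containing == 0) {
            return 0.0; // term never appears; avoid division by zero
        }
        return Math.log((double) docs.size() / containing);
    }

    /** TF-IDF weight of a term in one document, relative to the whole corpus. */
    public static double tfIdf(List<String> doc, List<List<String>> docs, String term) {
        return tf(doc, term) * idf(docs, term);
    }

    public static void main(String[] args) {
        // Toy corpus of already-tokenised documents (made-up data).
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("stock", "market", "falls"),
                Arrays.asList("stock", "prices", "rise"),
                Arrays.asList("team", "wins", "match"));
        for (String term : docs.get(0)) {
            System.out.println(term + " -> " + tfIdf(docs.get(0), docs, term));
        }
    }
}
```

In this toy corpus, "market" appears in only one document and so scores higher in the first document than "stock", which appears in two. A term that occurs in every document gets a weight of zero, which is why the highest-weighted terms can serve as topic words.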

K-Means Clustering

Run

$ javac -d bin -sourcepath src src/k_means/Main.java && java k_means.Main
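
As a rough illustration of what a from-scratch K-means looks like, the sketch below runs the standard assign-then-update loop (Lloyd's algorithm) over numeric vectors such as TF-IDF weights. It is an assumption-laden example rather than the repository's `k_means` code: the class `KMeansSketch`, Euclidean distance, and seeding the centroids with the first `k` points are all choices made here for simplicity and determinism.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the repository's k_means package.
class KMeansSketch {

    /** Squared Euclidean distance between two vectors of equal length. */
    public static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Clusters the points into k groups and returns one cluster index per point.
     * Assumes 1 <= k <= points.length and that all points share one dimension.
     */
    public static int[] cluster(double[][] points, int k, int maxIterations) {
        int dims = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone(); // assumption: seed with the first k points
        }
        int[] labels = new int[points.length];
        Arrays.fill(labels, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            boolean changed = false;
            for (int p = 0; p < points.length; p++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(points[p], centroids[c])
                            < squaredDistance(points[p], centroids[best])) {
                        best = c;
                    }
                }
                if (labels[p] != best) {
                    labels[p] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }
            // Update step: move each centroid to the mean of its members.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int p = 0; p < points.length; p++) {
                counts[labels[p]]++;
                for (int d = 0; d < dims; d++) {
                    sums[labels[p]][d] += points[p][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // leave an empty cluster's centroid where it was
                }
                for (int d = 0; d < dims; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {10, 10}, {0, 1}, {10, 11}}; // made-up data
        System.out.println(Arrays.toString(cluster(points, 2, 100)));
    }
}
```

Seeding with the first `k` points keeps the example reproducible. Random or K-means++ seeding is more common in practice, and the result can depend on the seed.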

K-Nearest Neighbour

Run

$ javac -d bin -sourcepath src src/knn/Main.java && java knn.Main Data/TestDataUnlabeled
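
To make the classification step concrete, the following sketch shows one simple way to write K-nearest-neighbour voting by hand: sort the labelled vectors by distance to the query and take a majority vote among the closest `k`. It is not the repository's `knn` code. The class `KnnSketch`, the Euclidean metric, the tie-breaking rule, and the example labels are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the repository's knn package.
class KnnSketch {

    /** Euclidean distance between two vectors of equal length. */
    public static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Returns the majority label among the k training vectors nearest to the query.
     * Ties go to the label that first reaches the winning count, scanning nearest first.
     */
    public static String classify(double[][] trainX, String[] trainY, double[] query, int k) {
        // Sort training indices by distance to the query (stable sort for objects).
        Integer[] order = new Integer[trainX.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(i -> distance(trainX[i], query)));

        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestVotes = 0;
        int limit = Math.min(k, order.length);
        for (int n = 0; n < limit; n++) {
            String label = trainY[order[n]];
            int v = votes.merge(label, 1, Integer::sum);
            if (v > bestVotes) {
                bestVotes = v;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up feature vectors and labels.
        double[][] trainX = {{0, 0}, {0, 1}, {5, 5}, {6, 5}};
        String[] trainY = {"sports", "sports", "finance", "finance"};
        System.out.println(classify(trainX, trainY, new double[] {0.2, 0.3}, 3));
    }
}
```

In a pipeline like the one described above, the training labels could come from the K-means cluster assignments and the query vectors from the TF-IDF weights of the unlabeled test articles. Whether the repository wires the steps together this way is not stated here.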
