fthbrmnby/Text-Preprocess

Text-Preprocess

Text preprocessing pipeline for my graduation project. The pipeline includes sentence boundary detection, a sentence tokenizer, a stemmer, a disambiguator, and a POS tagger. It uses the Turkish NLP library zemberek-nlp by Ahmet A. Akın and Turkish Deasciifier for Java by Ahmet Alp Balkan.
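
To illustrate how the first two stages fit together, here is a minimal sketch in plain Java 8. It is not the project's code and does not call zemberek-nlp: the class name `PipelineSketch` and its methods are hypothetical, the sentence splitter is a naive regex (it will break on abbreviations such as "Dr."), and the stemming, disambiguation, and POS tagging stages are left out because they depend on the library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the real pipeline delegates to zemberek-nlp.
class PipelineSketch {

    // Naive sentence boundary detection: split on whitespace that
    // follows '.', '!' or '?'. Assumption: no abbreviations in the input.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String s : text.trim().split("(?<=[.!?])\\s+")) {
            if (!s.isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    // Naive tokenizer: every run of Unicode letters or digits is a token,
    // so punctuation is dropped.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Chain the two stages: text -> sentences -> tokens per sentence.
    static List<List<String>> process(String text) {
        List<List<String>> result = new ArrayList<>();
        for (String sentence : splitSentences(text)) {
            result.add(tokenize(sentence));
        }
        return result;
    }

    public static void main(String[] args) {
        // ASCII-only Turkish, the kind of input a deasciifier would later repair.
        System.out.println(process("Urun cok guzel. Kargo hizli geldi!"));
    }
}
```

Keeping each stage as a separate method that takes and returns plain collections makes it straightforward to swap a naive stage for a library-backed one later.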

Dataset

| Type     | Number of Reviews |
|----------|-------------------|
| Positive | 220,284           |
| Negative | 14,881            |

Requirements

  • Java 8
  • Maven
