ELIT Tokenizer

ELIT (Emory Information and Language Technology) features an English tokenizer that splits text into a sequence of tokens and segments them into sentences using lexicon-based heuristics. This project is led by the Emory NLP Research Laboratory and is released under the Apache 2.0 license.

  • Latest release: 1.0 (10/15/2021)
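
To make the idea of lexicon-based tokenization and sentence segmentation concrete, here is a toy sketch in plain Python. It does not use elit_tokenizer and is not its algorithm or API; the `ABBREVIATIONS` lexicon and the `tokenize` and `segment` functions are hypothetical names chosen only for illustration.

```python
import re

# Hypothetical mini-lexicon: abbreviations whose trailing period
# should stay attached and should not end a sentence.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "e.g.", "i.e."}


def tokenize(text):
    """Split text on whitespace, then peel trailing punctuation off each chunk."""
    tokens = []
    for chunk in text.split():
        # Lexicon lookup: keep known abbreviations as a single token.
        if chunk.lower() in ABBREVIATIONS:
            tokens.append(chunk)
            continue
        # Separate the word body from any trailing punctuation marks.
        match = re.fullmatch(r"(.*?)([.!?,;:]*)", chunk)
        body, trailing = match.group(1), match.group(2)
        if body:
            tokens.append(body)
        tokens.extend(trailing)  # each punctuation mark becomes its own token
    return tokens


def segment(tokens):
    """Group tokens into sentences, closing a sentence at '.', '!' or '?'."""
    sentences, current = [], []
    for token in tokens:
        current.append(token)
        if token in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:  # keep a trailing sentence that lacks final punctuation
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    for sentence in segment(tokenize("Dr. Smith lives in Atlanta. He teaches NLP!")):
        print(sentence)
```

Because "Dr." is found in the lexicon, its period is not treated as a sentence boundary, so the example text yields two sentences rather than three. A real tokenizer needs a much larger lexicon and many more rules (contractions, URLs, numbers, quotes) than this sketch covers.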

Installation

Python 3.7 or higher is recommended:

pip install elit_tokenizer

Documentation

Contact