myBrillTagger

My version of Brill's well-known POS tagger, written in vanilla JavaScript. Part-of-speech taggers (POS taggers) are programs that take a text and try to determine which grammatical category (noun, verb, etc.) each word belongs to. https://en.wikipedia.org/wiki/Part-of-speech_tagging

This is useful when trying to get a rough idea of what a text is about, although it is not a semantic analysis. POS taggers usually rely on statistical techniques such as hidden Markov models (HMMs). Brill's tagger uses a simpler, rule-based technique: certainly not the most efficient, but easy to understand. https://en.wikipedia.org/wiki/Brill_tagger

It starts by labeling every word as a noun, then refines the tagging by correcting incongruities. For example, in English it converts a noun_tag to a past participle if the word ends with "ed".
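
The two passes described above can be sketched roughly as follows. This is only a minimal illustration, not the code in BrillTagger.js: the tag names `NOU` and `VPP`, the `rules` array and the `tagWords` function are hypothetical names chosen for the example, and the real mnemonics listed in mnemonics.txt may differ.

```javascript
// Hypothetical three-character tags; the repository's mnemonics.txt may use different ones.
const NOUN = "NOU";
const PAST_PARTICIPLE = "VPP";

// Each rule receives a word and its current tag, and returns a corrected tag
// (or the same tag when the rule does not apply).
const rules = [
  // The rule cited above: a noun ending in "ed" becomes a past participle.
  (word, tag) =>
    tag === NOUN && word.toLowerCase().endsWith("ed") ? PAST_PARTICIPLE : tag,
];

function tagWords(words) {
  // Pass 1: label every word as a noun.
  const tagged = words.map((word) => ({ word, tag: NOUN }));

  // Pass 2: apply each correction rule, in order, to every token.
  for (const rule of rules) {
    for (const token of tagged) {
      token.tag = rule(token.word, token.tag);
    }
  }
  return tagged;
}

const result = tagWords(["the", "door", "was", "painted"]);
console.log(result.map((t) => `${t.word}/${t.tag}`).join(" "));
// prints "the/NOU door/NOU was/NOU painted/VPP"
```

With a single rule, most words stay tagged as nouns. Tagging improves as more correction rules are appended to the list, which is why the order of the rules matters in this kind of design.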

Brill's tagger has only a few rules. Mine has many more, but I may have made a lot of errors, since English is not even my native language. I use three characters for tag mnemonics, whereas Brill used a mix of 2 and 3 characters. While Brill used the Brown corpus, I use a slightly different one.


JPLeRouzic/myBrillTagger
