Part of speech tagging

Part of Speech Tagging project from Udacity's NLP nanodegree.

In corpus linguistics, part-of-speech tagging (POS tagging, PoS tagging, or POST), also called grammatical tagging, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech,[1] based on both its definition and its context. A simplified form of this is commonly taught to school-age children as the identification of words as nouns, verbs, adjectives, adverbs, and so on.

This project uses the Pomegranate library to build a hidden Markov model (HMM) for part-of-speech tagging with a universal tagset. Hidden Markov models have achieved over 96% tag accuracy with larger tagsets on realistic text corpora. They have also been used for speech recognition and generation, machine translation, gene recognition in bioinformatics, human gesture recognition in computer vision, and more.
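To make the idea concrete, here is a minimal sketch of an HMM tagger written in plain Python rather than with Pomegranate, so it is not the code used in the project notebooks. It assumes a tiny hand-made corpus (`TRAIN`), add-one smoothing, and the helper names `train_hmm`, `log_prob`, and `viterbi`, all of which are hypothetical and chosen only for illustration. Tags are the hidden states, words are the observations, and Viterbi decoding picks the most likely tag sequence.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus with universal-style tags (not the Brown corpus).
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]


def train_hmm(sentences):
    """Collect start, transition (tag -> tag), and emission (tag -> word) counts."""
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    for sent in sentences:
        tags = [tag for _, tag in sent]
        start[tags[0]] += 1
        for prev, cur in zip(tags, tags[1:]):
            trans[prev][cur] += 1
        for word, tag in sent:
            emit[tag][word] += 1
    return start, trans, emit


def log_prob(counts, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log probability of `key` under the given counts."""
    total = sum(counts.values())
    return math.log((counts[key] + alpha) / (total + alpha * n_outcomes))


def viterbi(words, start, trans, emit):
    """Return the most likely tag sequence for `words`."""
    if not words:
        return []
    tags = sorted(emit)
    n_tags = len(tags)
    # Vocabulary size plus one slot reserved for unseen words.
    n_words = len({w for c in emit.values() for w in c}) + 1

    # Each column maps tag -> (best log score ending in that tag, previous tag).
    columns = [{
        t: (log_prob(start, t, n_tags) + log_prob(emit[t], words[0], n_words), None)
        for t in tags
    }]
    for word in words[1:]:
        column = {}
        for t in tags:
            score, prev = max(
                (columns[-1][p][0] + log_prob(trans[p], t, n_tags), p) for p in tags
            )
            column[t] = (score + log_prob(emit[t], word, n_words), prev)
        columns.append(column)

    # Follow the backpointers from the best final tag.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


start, trans, emit = train_hmm(TRAIN)
print(viterbi(["the", "dog", "sleeps"], start, trans, emit))  # ['DET', 'NOUN', 'VERB']
```

Working in log space avoids numerical underflow on longer sentences, and the smoothing keeps unseen words or tag pairs from receiving zero probability. A library-based model would typically replace the hand-written counting and decoding shown here.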

CODEOWNERS
HMM Tagger-zh.html
HMM Tagger-zh.ipynb
HMM Tagger.html
HMM Tagger.ipynb
HMM warmup (optional)-zh.html
HMM warmup (optional)-zh.ipynb
HMM warmup (optional).html
HMM warmup (optional).ipynb
LICENSE
README.md
_example.png
_post-hmm.png
brown-universal.txt
example.png
helpers.py
hmm-tagger.yaml
tags-universal.txt
