Code written as part of the assignments for CSE556 Natural Language Processing, taught by Dr. Tanmoy Chakraborty at IIIT Delhi in Monsoon 2018.
This course will cover a broad range of topics related to NLP, including basic text processing (such as tokenization and stemming), language modeling, morphology, syntax, dependency parsing, distributional and lexical semantics, sense disambiguation, and information extraction. We will also introduce the underlying theory from probability, statistics, and machine learning that is essential to understanding fundamental algorithms in NLP, such as language modeling and HMMs. The course will end with more advanced topics in NLP, such as stylometry analysis, sentiment analysis, named-entity disambiguation, and machine translation. The term projects will give students hands-on experience in designing different real-world NLP models.
- Introduction
- Regular Expressions, Text Normalization, and Edit Distance
- Morphology & Finite-state Transducers
- Probabilistic models & Spelling correction
- N-grams, smoothing and entropy
- HMM, Viterbi and A* decoding
- Word classes and POS tagging
- CFG for English and Parsing
- Semantics: Introduction & Distributional semantics
- Lexical semantics & Word Sense disambiguation
- Advanced topics: Text classification, Information retrieval
- Advanced topics: Sentiment analysis, Stylometry analysis
- Advanced topics: Web mining, Named-entity disambiguation
Copyright (c) 2019 Aditya Chetan
For license information, see LICENSE or http://mit-license.org